Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models

Dan Iter; Kelvin Guu; Larry Lansing; Dan Jurafsky

doi:10.18653/v1/2020.acl-main.439

Abstract

Recent models for unsupervised representation learning of text have employed a number of techniques to improve contextual word representations but have put little focus on discourse-level representations. We propose Conpono, an inter-sentence objective for pretraining language models that models discourse coherence and the distance between sentences. Given an anchor sentence, our model is trained to predict the text k sentences away using a sampled-softmax objective where the candidates consist of neighboring sentences and sentences randomly sampled from the corpus. On the discourse representation benchmark DiscoEval, our model improves over the previous state-of-the-art by up to 13% and on average 4% absolute across 7 tasks. Our model is the same size as BERT-Base, but outperforms the much larger BERT-Large model and other more recent approaches that incorporate discourse. We also show that Conpono yields gains of 2%-6% absolute even for tasks that do not explicitly evaluate discourse: textual entailment (RTE), common sense reasoning (COPA) and reading comprehension (ReCoRD).
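The objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it follows only the abstract's description (an anchor sentence, a target k sentences away, and a softmax over neighboring plus randomly sampled candidates). The function names, the `window` and `num_random` parameters, the toy `encode` function, and the dot-product scoring are all assumptions made for illustration.

```python
import math
import random


def encode(sentence, dim=8):
    """Toy stand-in for a sentence encoder (hypothetical, not the paper's model).

    Each word increments one bucket chosen by the sum of its character codes.
    """
    vec = [0.0] * dim
    for word in sentence.lower().split():
        vec[sum(ord(ch) for ch in word) % dim] += 1.0
    return vec


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def build_candidates(doc, anchor_idx, k, window, corpus_sentences, num_random, rng):
    """Collect candidates for one anchor: its neighbors plus random sentences.

    Returns (candidates, target_index), where the target is the sentence
    k positions away from the anchor (k may be negative).
    """
    neighbors = [
        i
        for i in range(anchor_idx - window, anchor_idx + window + 1)
        if i != anchor_idx and 0 <= i < len(doc)
    ]
    target_pos = anchor_idx + k
    if target_pos not in neighbors:
        raise ValueError("target must lie inside the window and the document")
    candidates = [doc[i] for i in neighbors]
    # Assumed negative sampling: draw extra sentences uniformly from the corpus.
    candidates += rng.sample(corpus_sentences, num_random)
    return candidates, neighbors.index(target_pos)


def sampled_softmax_loss(anchor_vec, candidate_vecs, target_index):
    """Negative log-probability of the target under a softmax over candidates."""
    scores = [dot(anchor_vec, c) for c in candidate_vecs]
    top = max(scores)  # subtract the max for numerical stability
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_z - scores[target_index]
```

To use the sketch, call `build_candidates` on a tokenized document, pass the anchor and each candidate through `encode`, and feed the resulting vectors to `sampled_softmax_loss`. In a real setup the toy encoder would be replaced by the pretrained language model, and the loss would be minimized by gradient descent.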

Anthology ID: 2020.acl-main.439
Volume: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month: July
Year: 2020
Address: Online
Editors: Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 4859–4870
URL: https://preview.aclanthology.org/jlcl-multiple-ingestion/2020.acl-main.439/
DOI: 10.18653/v1/2020.acl-main.439
Cite (ACL): Dan Iter, Kelvin Guu, Larry Lansing, and Dan Jurafsky. 2020. Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 4859–4870, Online. Association for Computational Linguistics.
Cite (Informal): Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models (Iter et al., ACL 2020)
PDF: https://preview.aclanthology.org/jlcl-multiple-ingestion/2020.acl-main.439.pdf
Video: http://slideslive.com/38928972
Code: google-research/language
Data: COPA, GLUE, ReCoRD