Is Incoherence Surprising? Targeted Evaluation of Coherence Prediction from Language Models

Anne Beyer, Sharid Loáiciga, David Schlangen


Abstract
Coherent discourse is distinguished from a mere collection of utterances by the satisfaction of a diverse set of constraints, for example choice of expression, logical relation between denoted events, and implicit compatibility with world-knowledge. Do neural language models encode such constraints? We design an extendable set of test suites addressing different aspects of discourse and dialogue coherence. Unlike most previous coherence evaluation studies, we address specific linguistic devices beyond sentence order perturbations, which allow for a more fine-grained analysis of what constitutes coherence and what neural models trained on a language modelling objective are capable of encoding. Extending the targeted evaluation paradigm for neural language models (Marvin and Linzen, 2018) to phenomena beyond syntax, we show that this paradigm is equally suited to evaluate linguistic qualities that contribute to the notion of coherence.
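As a rough illustration of the targeted evaluation idea mentioned in the abstract, the sketch below scores a coherent text and a minimally different incoherent variant, then checks which one the model finds less surprising. It makes several assumptions that are not from the paper: an add-one-smoothed bigram model stands in for a neural language model, the swapped-pronoun pairs are hypothetical examples rather than items from the authors' test suites, and `train_bigram`, `surprisal`, and `pair_accuracy` are illustrative helper names.

```python
import math
from collections import Counter

def train_bigram(corpus):
    """Count unigrams and bigrams over whitespace-tokenised texts."""
    unigrams, bigrams = Counter(), Counter()
    for text in corpus:
        toks = ["<s>"] + text.lower().split()
        unigrams.update(toks)
        bigrams.update(zip(toks, toks[1:]))
    return unigrams, bigrams

def surprisal(text, unigrams, bigrams):
    """Total surprisal in bits (negative log2 probability), add-one smoothed."""
    toks = ["<s>"] + text.lower().split()
    vocab = len(unigrams) + 1  # one extra slot for unseen tokens
    total = 0.0
    for prev, cur in zip(toks, toks[1:]):
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab)
        total -= math.log2(p)
    return total

def pair_accuracy(pairs, unigrams, bigrams):
    """Fraction of (coherent, incoherent) pairs where the coherent text is less surprising."""
    hits = sum(
        surprisal(coh, unigrams, bigrams) < surprisal(incoh, unigrams, bigrams)
        for coh, incoh in pairs
    )
    return hits / len(pairs)

# Toy training data (hypothetical, for illustration only).
corpus = [
    "anna bought a book . she read it .",
    "tom baked a cake . he ate it .",
]
unigrams, bigrams = train_bigram(corpus)

# Each pair: (coherent text, variant with the two pronouns swapped).
pairs = [
    ("anna bought a book . she read it .", "anna bought a book . it read she ."),
    ("tom baked a cake . he ate it .", "tom baked a cake . it ate he ."),
]
accuracy = pair_accuracy(pairs, unigrams, bigrams)
print(f"coherent text preferred in {accuracy:.0%} of pairs")
```

Because both members of a pair contain the same tokens, the comparison is not confounded by text length or word frequency; only the order of the pronouns differs.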
Anthology ID:
2021.naacl-main.328
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
June
Year:
2021
Address:
Online
Editors:
Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
4164–4173
URL:
https://aclanthology.org/2021.naacl-main.328
DOI:
10.18653/v1/2021.naacl-main.328
Cite (ACL):
Anne Beyer, Sharid Loáiciga, and David Schlangen. 2021. Is Incoherence Surprising? Targeted Evaluation of Coherence Prediction from Language Models. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4164–4173, Online. Association for Computational Linguistics.
Cite (Informal):
Is Incoherence Surprising? Targeted Evaluation of Coherence Prediction from Language Models (Beyer et al., NAACL 2021)
PDF:
https://preview.aclanthology.org/nschneid-patch-5/2021.naacl-main.328.pdf
Video:
https://preview.aclanthology.org/nschneid-patch-5/2021.naacl-main.328.mp4
Code:
AnneBeyer/coherencegym
Data:
ROCStories