Jigsaw Pieces of Meaning: Modeling Discourse Coherence with Informed Negative Sample Synthesis

Shubhankar Singh


Abstract
Coherence in discourse is fundamental for comprehension and perception. Much research on coherence modeling has focused on better model architectures and training setups optimized for the permuted document task, where random permutations of a coherent document are considered incoherent. However, there is very limited work on creating “informed” synthetic incoherent samples that better represent or mimic incoherence. We source a diverse positive corpus for local coherence and propose six rule-based methods for “informed” negative sample synthesis, leveraging information from constituency trees, part-of-speech tags, semantic overlap, and more, to better represent incoherence. We keep a straightforward training setup for local coherence modeling by fine-tuning popular transformer models, and aggregate local scores for global coherence. We evaluate on a battery of independent downstream tasks to assess the impact of improved negative sample quality. We assert that a step towards optimality in coherence modeling requires better negative sample synthesis in tandem with model improvements.
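As a point of reference, the permuted document baseline that the abstract contrasts with can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's method: the function names `permuted_negatives` and `global_score`, the parameters `k` and `seed`, and the choice of a simple mean over adjacent sentence pairs as the aggregation are all assumptions made here, since the abstract does not specify how local scores are aggregated.

```python
import random
from typing import Callable, List, Sequence


def permuted_negatives(sentences: Sequence[str], k: int = 3, seed: int = 0) -> List[List[str]]:
    """Return up to k distinct shuffles of `sentences` that differ from the original order.

    Hypothetical helper: this is the uninformed baseline, where any random
    permutation of a coherent document is treated as incoherent.
    """
    rng = random.Random(seed)
    original = list(sentences)
    negatives: List[List[str]] = []
    seen = {tuple(original)}  # never emit the original order or a repeat
    attempts = 0
    while len(negatives) < k and attempts < 100 * k:
        attempts += 1
        candidate = original[:]
        rng.shuffle(candidate)
        if tuple(candidate) not in seen:
            seen.add(tuple(candidate))
            negatives.append(candidate)
    return negatives


def global_score(sentences: Sequence[str], local_scorer: Callable[[str, str], float]) -> float:
    """Average a local coherence score over adjacent sentence pairs.

    The mean is an assumed aggregation, and `local_scorer` stands in for
    any model that scores a sentence pair.
    """
    pairs = list(zip(sentences, sentences[1:]))
    if not pairs:
        return 0.0
    return sum(local_scorer(a, b) for a, b in pairs) / len(pairs)
```

In this sketch, every shuffled copy is labeled incoherent regardless of how much it actually disrupts the text, which is the limitation that motivates the "informed" synthesis described in the abstract.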
Anthology ID:
2024.findings-eacl.128
Volume:
Findings of the Association for Computational Linguistics: EACL 2024
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Yvette Graham, Matthew Purver
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1895–1908
URL:
https://aclanthology.org/2024.findings-eacl.128
Cite (ACL):
Shubhankar Singh. 2024. Jigsaw Pieces of Meaning: Modeling Discourse Coherence with Informed Negative Sample Synthesis. In Findings of the Association for Computational Linguistics: EACL 2024, pages 1895–1908, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
Jigsaw Pieces of Meaning: Modeling Discourse Coherence with Informed Negative Sample Synthesis (Singh, Findings 2024)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/2024.findings-eacl.128.pdf