Cross-Linguistic Situation Entity Segmentation for Discourse Analysis in Diachronic English and German Text
Hanna Schmück, Veronika Urban, Xaver Krückl, Sonja Zeman, Claudia Claridge, Annemarie Friedrich
Abstract
Situation Entity (SE) segmentation identifies clause-like discourse units focusing on verb constellations. While SE segmentation has been applied to contemporary English as a subtask of SE annotation, systematic guidelines for syntactically ambiguous constructions remain underspecified. We present principled SE segmentation guidelines for contemporary and historical varieties of English and German. Our inter-annotator agreement studies on Late Modern English (1700–1900) and New High German (1650–1900) corpora demonstrate substantial agreement. Using the existing SitEnt corpus in contemporary English, we implement a new automatic segmenter based on XLM-RoBERTa. Our evaluation examines cross-variety and cross-lingual generalization, demonstrating challenges both for human annotation efforts and in transferring segmenters trained on contemporary English to historical varieties. Our code and data are publicly available at https://github.com/coling-unia/sitent-segmenter-law2026.
- Anthology ID:
- 2026.law-main.8
- Volume:
- Proceedings of the 20th Linguistic Annotation Workshop (LAW XX)
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, USA
- Editors:
- Yang Janet Liu, Luke Gessler
- Venues:
- LAW | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 95–112
- URL:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.law-main.8/
- Cite (ACL):
- Hanna Schmück, Veronika Urban, Xaver Krückl, Sonja Zeman, Claudia Claridge, and Annemarie Friedrich. 2026. Cross-Linguistic Situation Entity Segmentation for Discourse Analysis in Diachronic English and German Text. In Proceedings of the 20th Linguistic Annotation Workshop (LAW XX), pages 95–112, San Diego, California, USA. Association for Computational Linguistics.
- Cite (Informal):
- Cross-Linguistic Situation Entity Segmentation for Discourse Analysis in Diachronic English and German Text (Schmück et al., LAW 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl-workshops/2026.law-main.8.pdf