The RST Continuity Corpus

Debopam Das, Markus Egg


Abstract
We present the RST Continuity Corpus (RST-CC), a corpus of discourse relations annotated for continuity dimensions. Continuity or discontinuity (maintaining or shifting deictic centres across discourse segments) is an important property of discourse relations, but the two are correlated in greatly varying ways. To analyse this correlation, the relations in the RST-CC are annotated using operationalised versions of Givón’s (1993) continuity dimensions. We also report on the inter-annotator agreement, and discuss recurrent annotation issues. First results show substantial variation of continuity dimensions within and across relation types.
Anthology ID:
2023.law-1.16
Volume:
Proceedings of the 17th Linguistic Annotation Workshop (LAW-XVII)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Jakob Prange, Annemarie Friedrich
Venue:
LAW
Publisher:
Association for Computational Linguistics
Pages:
154–165
URL:
https://aclanthology.org/2023.law-1.16
DOI:
10.18653/v1/2023.law-1.16
Cite (ACL):
Debopam Das and Markus Egg. 2023. The RST Continuity Corpus. In Proceedings of the 17th Linguistic Annotation Workshop (LAW-XVII), pages 154–165, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
The RST Continuity Corpus (Das & Egg, LAW 2023)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2023.law-1.16.pdf