Abstract
This paper describes our participating system for the Shared Task on Discourse Segmentation and Connective Identification across Formalisms and Languages. Key features of the presented approach are the formulation as a clause-level classification task, a language-independent feature inventory based on Universal Dependencies grammar, and composite-verb-form analysis. The achieved F1 is 92% for German and English and lower for other languages. The paper also presents a clause-level tagger for grammatical tense, aspect, mood, voice and modality in 11 languages.
- Anthology ID:
- 2021.disrpt-1.4
- Volume:
- Proceedings of the 2nd Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2021)
- Month:
- November
- Year:
- 2021
- Address:
- Punta Cana, Dominican Republic
- Editors:
- Amir Zeldes, Yang Janet Liu, Mikel Iruskieta, Philippe Muller, Chloé Braud, Sonia Badene
- Venue:
- DISRPT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 33–45
- URL:
- https://aclanthology.org/2021.disrpt-1.4
- DOI:
- 10.18653/v1/2021.disrpt-1.4
- Cite (ACL):
- Tillmann Dönicke. 2021. Delexicalised Multilingual Discourse Segmentation for DISRPT 2021 and Tense, Mood, Voice and Modality Tagging for 11 Languages. In Proceedings of the 2nd Shared Task on Discourse Relation Parsing and Treebanking (DISRPT 2021), pages 33–45, Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal):
- Delexicalised Multilingual Discourse Segmentation for DISRPT 2021 and Tense, Mood, Voice and Modality Tagging for 11 Languages (Dönicke, DISRPT 2021)
- PDF:
- https://preview.aclanthology.org/proper-vol2-ingestion/2021.disrpt-1.4.pdf
- Data
- DISRPT2021, Universal Dependencies