Abstract
Discourse parsing is a crucial task in natural language processing that aims to reveal the higher-level relations in a text. Despite growing interest in cross-lingual discourse parsing, challenges persist due to limited parallel data and inconsistencies in the application of Rhetorical Structure Theory (RST) across languages and corpora. To address this, we introduce a parallel Russian annotation for the large and diverse English GUM RST corpus. Leveraging recent advances, our end-to-end RST parser achieves state-of-the-art results on both English and Russian corpora. It demonstrates effectiveness in both monolingual and bilingual settings, successfully transferring even with limited second-language annotation. To the best of our knowledge, this work is the first to evaluate the potential of cross-lingual end-to-end RST parsing on a manually annotated parallel corpus.
- Anthology ID:
- 2024.findings-acl.577
- Volume:
- Findings of the Association for Computational Linguistics ACL 2024
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand and virtual meeting
- Editors:
- Lun-Wei Ku, Andre Martins, Vivek Srikumar
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 9689–9706
- URL:
- https://aclanthology.org/2024.findings-acl.577
- Cite (ACL):
- Elena Chistova. 2024. Bilingual Rhetorical Structure Parsing with Large Parallel Annotations. In Findings of the Association for Computational Linguistics ACL 2024, pages 9689–9706, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
- Cite (Informal):
- Bilingual Rhetorical Structure Parsing with Large Parallel Annotations (Chistova, Findings 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.findings-acl.577.pdf