Abstract
The availability of corpora annotated for discourse relations is limited, and discourse relation classification performance varies greatly depending on both language and domain. This is a problem for downstream applications intended for a language (i.e., other than English) or a domain (i.e., other than financial news) with comparatively low coverage of discourse annotations. In this paper, we experiment with a state-of-the-art model for discourse relation classification, originally developed for English, extend it to a multilingual setting (testing on Italian, Portuguese, and Turkish), and employ a simple yet effective method to mark out-of-domain training instances. By doing so, we aim to contribute to better generalization and more robust discourse relation classification performance across both language and domain.
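The abstract does not spell out how out-of-domain instances are marked, so the following is only a minimal, hypothetical sketch of one common way such marking can be done: prepending a special marker token to training instances whose source domain differs from the target domain, before feeding the argument pairs to a relation classifier. The `Instance` fields, the `<ood>` marker, and the function name are illustrative assumptions, not the authors' exact method.

```python
# Hypothetical sketch: tag out-of-domain training instances with a marker token.
# The "<ood>" token and the data layout are assumptions for illustration only.

from dataclasses import dataclass

@dataclass
class Instance:
    arg1: str    # first discourse argument
    arg2: str    # second discourse argument
    label: str   # discourse relation sense
    domain: str  # source domain of the annotated corpus

def mark_out_of_domain(instances, target_domain, marker="<ood>"):
    """Prepend a marker to the first argument of instances from other domains."""
    marked = []
    for inst in instances:
        prefix = f"{marker} " if inst.domain != target_domain else ""
        marked.append((prefix + inst.arg1, inst.arg2, inst.label))
    return marked

# Usage example: news-domain data mixed into a dialogue-domain target setting.
train = [
    Instance("He was late,", "so the meeting was postponed.", "Contingency.Cause", "news"),
    Instance("I wanted to go,", "but I was too tired.", "Comparison.Concession", "dialogue"),
]
for arg1, arg2, label in mark_out_of_domain(train, target_domain="dialogue"):
    print(arg1, "||", arg2, "->", label)
```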
- Anthology ID: 2024.sigdial-1.47
- Volume: Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue
- Month: September
- Year: 2024
- Address: Kyoto, Japan
- Editors: Tatsuya Kawahara, Vera Demberg, Stefan Ultes, Koji Inoue, Shikib Mehri, David Howcroft, Kazunori Komatani
- Venue: SIGDIAL
- SIG: SIGDIAL
- Publisher: Association for Computational Linguistics
- Pages: 554–565
- URL: https://aclanthology.org/2024.sigdial-1.47
- DOI: 10.18653/v1/2024.sigdial-1.47
- Cite (ACL): Peter Bourgonje and Vera Demberg. 2024. Generalizing across Languages and Domains for Discourse Relation Classification. In Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 554–565, Kyoto, Japan. Association for Computational Linguistics.
- Cite (Informal): Generalizing across Languages and Domains for Discourse Relation Classification (Bourgonje & Demberg, SIGDIAL 2024)
- PDF: https://preview.aclanthology.org/ingest-2024-clasp/2024.sigdial-1.47.pdf