Investigating the content and form of referring expressions in Mandarin: introducing the Mtuna corpus
Kees van Deemter, Le Sun, Rint Sybesma, Xiao Li, Bo Chen, Muyun Yang
Abstract
East Asian languages are thought to handle reference differently from languages such as English, particularly in terms of the marking of definiteness and number. We present the first Data-Text corpus for Referring Expressions in Mandarin, and we use this corpus to test some initial hypotheses inspired by the theoretical linguistics literature. Our findings suggest that function words deserve more attention in Referring Expressions Generation than they have so far received, and they have a bearing on the debate about whether different languages make different trade-offs between clarity and brevity.
- Anthology ID:
- W17-3532
- Volume:
- Proceedings of the 10th International Conference on Natural Language Generation
- Month:
- September
- Year:
- 2017
- Address:
- Santiago de Compostela, Spain
- Editors:
- Jose M. Alonso, Alberto Bugarín, Ehud Reiter
- Venue:
- INLG
- SIG:
- SIGGEN
- Publisher:
- Association for Computational Linguistics
- Pages:
- 213–217
- URL:
- https://aclanthology.org/W17-3532
- DOI:
- 10.18653/v1/W17-3532
- Cite (ACL):
- Kees van Deemter, Le Sun, Rint Sybesma, Xiao Li, Bo Chen, and Muyun Yang. 2017. Investigating the content and form of referring expressions in Mandarin: introducing the Mtuna corpus. In Proceedings of the 10th International Conference on Natural Language Generation, pages 213–217, Santiago de Compostela, Spain. Association for Computational Linguistics.
- Cite (Informal):
- Investigating the content and form of referring expressions in Mandarin: introducing the Mtuna corpus (van Deemter et al., INLG 2017)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/W17-3532.pdf