GuoFeng: A Benchmark for Zero Pronoun Recovery and Translation
Mingzhou Xu, Longyue Wang, Derek F. Wong, Hongye Liu, Linfeng Song, Lidia S. Chao, Shuming Shi, Zhaopeng Tu
Abstract
The phenomenon of zero pronoun (ZP) has attracted increasing interest in the machine translation (MT) community due to its importance and difficulty. However, previous studies generally evaluate the quality of ZP translation with BLEU scores on MT testsets, which are neither expressive nor sensitive enough for accurate assessment. To bridge the data and evaluation gaps, we propose a benchmark testset for targeted evaluation of Chinese-English ZP translation. The human-annotated testset covers five challenging genres, which reveal different characteristics of ZPs for comprehensive evaluation. We systematically revisit eight advanced models on ZP translation and identify current challenges for future exploration. We release data, code, models and annotation guidelines, which we hope can significantly promote research in this field (https://github.com/longyuewangdcu/mZPRT).
- Anthology ID: 2022.emnlp-main.774
- Volume: Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
- Month: December
- Year: 2022
- Address: Abu Dhabi, United Arab Emirates
- Editors: Yoav Goldberg, Zornitsa Kozareva, Yue Zhang
- Venue: EMNLP
- Publisher: Association for Computational Linguistics
- Pages: 11266–11278
- URL: https://aclanthology.org/2022.emnlp-main.774
- DOI: 10.18653/v1/2022.emnlp-main.774
- Cite (ACL): Mingzhou Xu, Longyue Wang, Derek F. Wong, Hongye Liu, Linfeng Song, Lidia S. Chao, Shuming Shi, and Zhaopeng Tu. 2022. GuoFeng: A Benchmark for Zero Pronoun Recovery and Translation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 11266–11278, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
- Cite (Informal): GuoFeng: A Benchmark for Zero Pronoun Recovery and Translation (Xu et al., EMNLP 2022)
- PDF: https://preview.aclanthology.org/ingest-acl-2023-videos/2022.emnlp-main.774.pdf