Abstract
We construct Global Voices, a multilingual dataset for evaluating cross-lingual summarization methods. We extract social-network descriptions of Global Voices news articles to cheaply collect evaluation data for into-English and from-English summarization in 15 languages. In particular, for the into-English summarization task, we crowd-source a high-quality evaluation dataset based on guidelines that emphasize accuracy, coverage, and understandability. To ensure the quality of this dataset, we collect human ratings to filter out bad summaries and conduct a human survey, which shows that the remaining summaries are preferred over the social-network summaries. We study the effect of translation quality in cross-lingual summarization, comparing a translate-then-summarize approach with several baselines. Our results highlight limitations of the ROUGE metric that are overlooked in monolingual summarization.
 - Anthology ID:
 - D19-5411
 - Volume:
 - Proceedings of the 2nd Workshop on New Frontiers in Summarization
 - Month:
 - November
 - Year:
 - 2019
 - Address:
 - Hong Kong, China
 - Editors:
 - Lu Wang, Jackie Chi Kit Cheung, Giuseppe Carenini, Fei Liu
 - Venue:
 - WS
 - Publisher:
 - Association for Computational Linguistics
 - Pages:
 - 90–97
 - URL:
 - https://aclanthology.org/D19-5411
 - DOI:
 - 10.18653/v1/D19-5411
 - Cite (ACL):
 - Khanh Nguyen and Hal Daumé III. 2019. Global Voices: Crossing Borders in Automatic News Summarization. In Proceedings of the 2nd Workshop on New Frontiers in Summarization, pages 90–97, Hong Kong, China. Association for Computational Linguistics.
 - Cite (Informal):
 - Global Voices: Crossing Borders in Automatic News Summarization (Nguyen & Daumé III, 2019)
 - PDF:
 - https://preview.aclanthology.org/ingest-acl-2023-videos/D19-5411.pdf
 - Data
 - Global Voices