Neural Label Search for Zero-Shot Multi-Lingual Extractive Summarization
Ruipeng Jia, Xingxing Zhang, Yanan Cao, Zheng Lin, Shi Wang, Furu Wei
Abstract
In zero-shot multilingual extractive text summarization, a model is typically trained on an English summarization dataset and then applied to summarization datasets in other languages. Given English gold summaries and documents, sentence-level labels for extractive summarization are usually generated using heuristics. However, these monolingual labels created on English datasets may not be optimal for datasets in other languages, because of syntactic or semantic discrepancies between languages. It is therefore possible to translate the English dataset into other languages and obtain different sets of labels, again using heuristics. To fully leverage the information in these different sets of labels, we propose NLSSum (Neural Label Search for Summarization), which jointly learns hierarchical weights for these different sets of labels together with our summarization model. We conduct multilingual zero-shot summarization experiments on the MLSUM and WikiLingua datasets, and we achieve state-of-the-art results under both human and automatic evaluation across these two datasets.
- Anthology ID:
- 2022.acl-long.42
- Volume:
- Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Editors:
- Smaranda Muresan, Preslav Nakov, Aline Villavicencio
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 561–570
- URL:
- https://aclanthology.org/2022.acl-long.42
- DOI:
- 10.18653/v1/2022.acl-long.42
- Cite (ACL):
- Ruipeng Jia, Xingxing Zhang, Yanan Cao, Zheng Lin, Shi Wang, and Furu Wei. 2022. Neural Label Search for Zero-Shot Multi-Lingual Extractive Summarization. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 561–570, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Neural Label Search for Zero-Shot Multi-Lingual Extractive Summarization (Jia et al., ACL 2022)
- PDF:
- https://preview.aclanthology.org/naacl24-info/2022.acl-long.42.pdf
- Data
- MLSUM, WikiLingua
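
The abstract mentions two steps that a small example may make more concrete: generating sentence-level labels with a heuristic, and combining several sets of such labels with weights. The Python sketch below is only an illustration under stated assumptions, not the NLSSum implementation. The abstract does not name the heuristic, so the sketch assumes a greedy selection driven by unigram-overlap F1 as a stand-in for a ROUGE-style score. The fixed weights are placeholders for the hierarchical weights that the paper learns jointly with the summarizer. All function names (unigram_f1, greedy_labels, combine_label_sets) are hypothetical.

```python
from collections import Counter


def unigram_f1(candidate_tokens, reference_tokens):
    """Unigram-overlap F1 (assumed stand-in for a ROUGE-style score)."""
    if not candidate_tokens or not reference_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(candidate_tokens) & Counter(reference_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate_tokens)
    recall = overlap / len(reference_tokens)
    return 2 * precision * recall / (precision + recall)


def greedy_labels(sentences, summary, max_sentences=3):
    """Greedily add sentences while the overlap score keeps improving.

    Returns one 0/1 label per document sentence.
    """
    reference = summary.lower().split()
    tokenized = [s.lower().split() for s in sentences]
    selected, best = [], 0.0
    while len(selected) < max_sentences:
        scored = []
        for i in range(len(tokenized)):
            if i in selected:
                continue
            candidate = [t for j in selected + [i] for t in tokenized[j]]
            scored.append((unigram_f1(candidate, reference), i))
        if not scored:
            break
        score, index = max(scored)
        if score <= best:  # no remaining sentence improves the score
            break
        selected.append(index)
        best = score
    return [1 if i in selected else 0 for i in range(len(sentences))]


def combine_label_sets(label_sets, weights):
    """Weighted average of several 0/1 label sets into soft labels.

    The weights here are fixed placeholders; they are not learned.
    """
    total = sum(weights)
    return [
        sum(w * labels[i] for w, labels in zip(weights, label_sets)) / total
        for i in range(len(label_sets[0]))
    ]


document = [
    "the cat sat on the mat",
    "stocks fell sharply today",
    "a dog barked",
]
english_labels = greedy_labels(document, "the cat sat")
# Hypothetical second label set, e.g. obtained from a translated copy.
translated_labels = [0, 1, 0]
soft_labels = combine_label_sets([english_labels, translated_labels], [3, 1])
```

In this sketch each label set is computed independently and merged afterwards; learning the weights together with the summarization model, as the abstract describes, would replace the fixed list passed to combine_label_sets.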