Abstract
We present a robust neural abstractive summarization system for cross-lingual summarization. We construct summarization corpora for documents automatically translated from three low-resource languages, Somali, Swahili, and Tagalog, using machine translation and the New York Times summarization corpus. We train three language-specific abstractive summarizers and evaluate on documents originally written in the source languages, as well as on a fourth, unseen language: Arabic. Our systems achieve significantly higher fluency than a standard copy-attention summarizer on automatically translated input documents, as well as comparable content selection.
- Anthology ID:
- N19-1204
- Volume:
- Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
- Month:
- June
- Year:
- 2019
- Address:
- Minneapolis, Minnesota
- Editors:
- Jill Burstein, Christy Doran, Thamar Solorio
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2025–2031
- URL:
- https://aclanthology.org/N19-1204
- DOI:
- 10.18653/v1/N19-1204
- Cite (ACL):
- Jessica Ouyang, Boya Song, and Kathy McKeown. 2019. A Robust Abstractive System for Cross-Lingual Summarization. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 2025–2031, Minneapolis, Minnesota. Association for Computational Linguistics.
- Cite (Informal):
- A Robust Abstractive System for Cross-Lingual Summarization (Ouyang et al., NAACL 2019)
- PDF:
- https://preview.aclanthology.org/proper-vol2-ingestion/N19-1204.pdf
- Data
- New York Times Annotated Corpus