Abstract
The MultiLing 2019 Headline Generation Task on the Wikipedia Corpus raised a critical and practical problem: performing a multilingual task on a low-resource corpus. In this paper we propose the QDAS extractive summarization model, enhanced by sentence2vec, and apply transfer learning based on a large multilingual pre-trained language model to the Wikipedia Headline Generation task. We treat headline generation as a sequence labeling task and develop two schemes to handle it. Experimental results show that a large pre-trained model can effectively use its learned knowledge to extract certain phrases from low-resource supervised data.
- Anthology ID:
- W19-8904
- Volume:
- Proceedings of the Workshop MultiLing 2019: Summarization Across Languages, Genres and Sources
- Month:
- September
- Year:
- 2019
- Address:
- Varna, Bulgaria
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 17–25
- URL:
- https://aclanthology.org/W19-8904
- DOI:
- 10.26615/978-954-452-058-8_004
- Cite (ACL):
- Wei Liu, Lei Li, Zuying Huang, and Yinan Liu. 2019. Multi-lingual Wikipedia Summarization and Title Generation On Low Resource Corpus. In Proceedings of the Workshop MultiLing 2019: Summarization Across Languages, Genres and Sources, pages 17–25, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal):
- Multi-lingual Wikipedia Summarization and Title Generation On Low Resource Corpus (Liu et al., RANLP 2019)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/W19-8904.pdf