Abstractive Document Summarization with Word Embedding Reconstruction
Jingyi You, Chenlong Hu, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura
Abstract
Neural sequence-to-sequence (Seq2Seq) models and BERT have achieved substantial improvements in abstractive document summarization (ADS) without and with pre-training, respectively. However, they sometimes repeatedly attend to unimportant source phrases while mistakenly ignoring important ones. We present reconstruction mechanisms on two levels to alleviate this issue. The sequence-level reconstructor reconstructs the whole document from the hidden layer of the target summary, while the word embedding-level one rebuilds the average of the source word embeddings at the target side, so that as much critical information as possible is included in the summary. Based on the assumption that inverse document frequency (IDF) measures how important a word is, we further leverage IDF weights in our embedding-level reconstructor. The proposed frameworks lead to promising improvements in ROUGE metrics and human ratings on both the CNN/Daily Mail and Newsroom summarization datasets.
- Anthology ID:
- 2021.ranlp-1.178
- Volume:
- Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
- Month:
- September
- Year:
- 2021
- Address:
- Held Online
- Editors:
- Ruslan Mitkov, Galia Angelova
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 1586–1596
- URL:
- https://aclanthology.org/2021.ranlp-1.178
- Cite (ACL):
- Jingyi You, Chenlong Hu, Hidetaka Kamigaito, Hiroya Takamura, and Manabu Okumura. 2021. Abstractive Document Summarization with Word Embedding Reconstruction. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 1586–1596, Held Online. INCOMA Ltd.
- Cite (Informal):
- Abstractive Document Summarization with Word Embedding Reconstruction (You et al., RANLP 2021)
- PDF:
- https://preview.aclanthology.org/fix-dup-bibkey/2021.ranlp-1.178.pdf
- Data
- NEWSROOM
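
The abstract describes an embedding-level reconstructor whose target is the IDF-weighted average of the source word embeddings. The following is a minimal sketch of how such a target and a reconstruction loss could be computed. It is not the authors' implementation: the smoothed IDF formula, the squared-error loss, the toy corpus, and all function names (`idf_weights`, `weighted_average_embedding`, `reconstruction_loss`) are assumptions made for illustration, since the abstract does not specify them.

```python
import math
from collections import Counter


def idf_weights(documents):
    """Smoothed IDF per word: log((1 + N) / (1 + df)) + 1 (assumed formula)."""
    n_docs = len(documents)
    # Count each word once per document to get its document frequency.
    df = Counter(word for doc in documents for word in set(doc))
    return {w: math.log((1 + n_docs) / (1 + c)) + 1.0 for w, c in df.items()}


def weighted_average_embedding(tokens, embeddings, idf):
    """IDF-weighted mean of the embeddings of `tokens`.

    Words missing from `idf` get weight 1.0 (an assumption of this sketch).
    """
    if not tokens:
        raise ValueError("tokens must be non-empty")
    dim = len(embeddings[tokens[0]])
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        w = idf.get(tok, 1.0)
        for i, value in enumerate(embeddings[tok]):
            total[i] += w * value
        weight_sum += w
    return [x / weight_sum for x in total]


def reconstruction_loss(predicted, target):
    """Squared L2 distance between predicted and target vectors (assumed loss)."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target))


# Toy corpus and 2-dimensional embeddings, invented for this example.
corpus = [["the", "cat", "sat"], ["the", "dog", "barked"], ["the", "cat", "purred"]]
toy_embeddings = {
    "the": [0.0, 0.0], "cat": [1.0, 0.0], "sat": [0.0, 1.0],
    "dog": [1.0, 1.0], "barked": [0.5, 0.5], "purred": [0.2, 0.8],
}

idf = idf_weights(corpus)
source = corpus[0]
target_vector = weighted_average_embedding(source, toy_embeddings, idf)

# Stand-in for what a decoder-side reconstructor would predict.
predicted_vector = [0.3, 0.3]
loss = reconstruction_loss(predicted_vector, target_vector)
```

In this sketch, a word that appears in every document ("the") receives the lowest weight, so rarer words dominate the averaged target. In a trained model, `predicted_vector` would come from the summary side, and the loss would be added to the usual summarization objective.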