Named Entity Inclusion in Abstractive Text Summarization

Sergey Berezin, Tatiana Batura


Abstract
We address named entity omission, a drawback of many current abstractive text summarizers. We propose a custom pretraining objective that strengthens the model's attention to the named entities in a text. First, a RoBERTa-based named entity recognition model is trained to identify named entities in the text. This model is then used to mask the named entities, and the BART model is trained to reconstruct them. Next, the BART model is fine-tuned on the summarization task. Our experiments show that this pretraining approach drastically improves named entity inclusion precision and recall metrics.
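The masking step and the inclusion metrics described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The helper names `mask_entities` and `entity_inclusion` are hypothetical, and the entity spans are assumed to come from an NER model that is not shown. The `<mask>` string is BART's mask token. The metric definition (set overlap between source and summary entities, case-insensitive) is an assumption, since the abstract does not spell it out.

```python
from typing import List, Set, Tuple

MASK_TOKEN = "<mask>"  # BART's mask token string


def mask_entities(text: str, spans: List[Tuple[int, int]],
                  mask_token: str = MASK_TOKEN) -> str:
    """Replace each (start, end) character span with a mask token.

    The spans are assumed to be produced by an NER model (not shown here).
    """
    pieces = []
    cursor = 0
    for start, end in sorted(spans):
        if start < cursor:
            continue  # skip spans that overlap an already masked one
        pieces.append(text[cursor:start])
        pieces.append(mask_token)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


def entity_inclusion(source_entities: Set[str],
                     summary_entities: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) of summary entities against source entities.

    Assumed definition: precision is the share of summary entities that also
    occur in the source; recall is the share of source entities that appear
    in the summary. Matching is case-insensitive exact string match.
    """
    src = {e.lower() for e in source_entities}
    summ = {e.lower() for e in summary_entities}
    overlap = src & summ
    precision = len(overlap) / len(summ) if summ else 0.0
    recall = len(overlap) / len(src) if src else 0.0
    return precision, recall


if __name__ == "__main__":
    sentence = "BART was fine-tuned on SciERC."
    start = sentence.index("SciERC")
    print(mask_entities(sentence, [(0, 4), (start, start + len("SciERC"))]))
    print(entity_inclusion({"BART", "SciERC", "RoBERTa"}, {"bart", "ACL"}))
```

In this sketch the masked text would serve as the input of the reconstruction objective, with the original text as the target; how the authors batch and tokenize these pairs is not stated in the abstract.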
Anthology ID:
2022.sdp-1.17
Volume:
Proceedings of the Third Workshop on Scholarly Document Processing
Month:
October
Year:
2022
Address:
Gyeongju, Republic of Korea
Editors:
Arman Cohan, Guy Feigenblat, Dayne Freitag, Tirthankar Ghosal, Drahomira Herrmannova, Petr Knoth, Kyle Lo, Philipp Mayr, Michal Shmueli-Scheuer, Anita de Waard, Lucy Lu Wang
Venue:
sdp
Publisher:
Association for Computational Linguistics
Pages:
158–162
URL:
https://aclanthology.org/2022.sdp-1.17
Cite (ACL):
Sergey Berezin and Tatiana Batura. 2022. Named Entity Inclusion in Abstractive Text Summarization. In Proceedings of the Third Workshop on Scholarly Document Processing, pages 158–162, Gyeongju, Republic of Korea. Association for Computational Linguistics.
Cite (Informal):
Named Entity Inclusion in Abstractive Text Summarization (Berezin & Batura, sdp 2022)
PDF:
https://preview.aclanthology.org/naacl24-info/2022.sdp-1.17.pdf
Data
SciERC