Abstract
Task-agnostic pretraining objectives like masked language models or corrupted span prediction are applicable to a wide range of NLP downstream tasks (Raffel et al., 2019), but are outperformed by task-specific pretraining objectives like predicting extracted gap sentences on summarization (Zhang et al., 2020). We compare three summarization-specific pretraining objectives with task-agnostic corrupted span prediction pretraining in a controlled study. We also extend our study to a low-resource and zero-shot setup, to understand how many training examples are needed before task-specific pretraining can be ablated without quality loss. Our results show that task-agnostic pretraining is sufficient for most cases, which hopefully reduces the need for costly task-specific pretraining. We also report new state-of-the-art numbers for two summarization tasks using a T5 model with 11 billion parameters and an optimal beam search length penalty.
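To make the two families of objectives concrete, the sketch below shows one way pretraining examples might be constructed: T5-style corrupted span prediction replaces random token spans with sentinel markers and asks the model to reconstruct them, while PEGASUS-style gap-sentence prediction masks whole sentences and uses them as the target. This is an illustrative assumption, not the authors' implementation; the function names and hyperparameters are made up, and the random gap-sentence selection simplifies PEGASUS, which selects principal sentences by ROUGE.

```python
# Hedged sketch of the two pretraining objectives discussed in the paper.
# Names, span lengths, sentinel format, and sentence selection are assumptions.
import random

def corrupted_span_prediction(tokens, corruption_rate=0.15, mean_span_len=3):
    """T5-style span corruption: replace random token spans with sentinels;
    the target reconstructs each removed span after its sentinel."""
    n_to_corrupt = max(1, int(len(tokens) * corruption_rate))
    inputs, targets, i, sentinel = [], [], 0, 0
    while i < len(tokens):
        if n_to_corrupt > 0 and random.random() < corruption_rate:
            span_len = min(mean_span_len, n_to_corrupt, len(tokens) - i)
            inputs.append(f"<extra_id_{sentinel}>")
            targets.append(f"<extra_id_{sentinel}>")
            targets.extend(tokens[i:i + span_len])
            sentinel += 1
            n_to_corrupt -= span_len
            i += span_len
        else:
            inputs.append(tokens[i])
            i += 1
    return " ".join(inputs), " ".join(targets)

def gap_sentence_prediction(sentences, gap_ratio=0.3):
    """PEGASUS-style gap sentences: mask a subset of sentences in the input;
    the target is their concatenation. Selection here is random for brevity,
    whereas PEGASUS picks principal sentences by ROUGE against the rest."""
    n_gaps = max(1, int(len(sentences) * gap_ratio))
    gap_ids = set(random.sample(range(len(sentences)), n_gaps))
    inputs = ["<mask>" if i in gap_ids else s for i, s in enumerate(sentences)]
    target = " ".join(sentences[i] for i in sorted(gap_ids))
    return " ".join(inputs), target

if __name__ == "__main__":
    print(corrupted_span_prediction("the quick brown fox jumps over the lazy dog".split()))
    print(gap_sentence_prediction(["The cat sat.", "It was warm.", "Then it slept."]))
```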
- Anthology ID: 2021.emnlp-main.12
- Volume: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
- Month: November
- Year: 2021
- Address: Online and Punta Cana, Dominican Republic
- Editors: Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
- Venue: EMNLP
- Publisher: Association for Computational Linguistics
- Pages: 140–145
- URL: https://aclanthology.org/2021.emnlp-main.12
- DOI: 10.18653/v1/2021.emnlp-main.12
- Cite (ACL): Sascha Rothe, Joshua Maynez, and Shashi Narayan. 2021. A Thorough Evaluation of Task-Specific Pretraining for Summarization. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 140–145, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
- Cite (Informal): A Thorough Evaluation of Task-Specific Pretraining for Summarization (Rothe et al., EMNLP 2021)
- PDF: https://preview.aclanthology.org/add_acl24_videos/2021.emnlp-main.12.pdf