Abstract
We introduce and study the task of clickbait spoiling: generating a short text that satisfies the curiosity induced by a clickbait post. Clickbait links to a web page and advertises its contents by arousing curiosity instead of providing an informative summary. Our contributions are approaches to classify the type of spoiler needed (i.e., a phrase or a passage), and to generate appropriate spoilers. A large-scale evaluation and error analysis on a new corpus of 5,000 manually spoiled clickbait posts (the Webis Clickbait Spoiling Corpus 2022) shows that our spoiler type classifier achieves an accuracy of 80%, while the question answering model DeBERTa-large outperforms all others in generating spoilers for both types.
- Anthology ID:
- 2022.acl-long.484
- Volume:
- Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 7025–7036
- URL:
- https://aclanthology.org/2022.acl-long.484
- DOI:
- 10.18653/v1/2022.acl-long.484
- Cite (ACL):
- Matthias Hagen, Maik Fröbe, Artur Jurk, and Martin Potthast. 2022. Clickbait Spoiling via Question Answering and Passage Retrieval. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7025–7036, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Clickbait Spoiling via Question Answering and Passage Retrieval (Hagen et al., ACL 2022)
- PDF:
- https://preview.aclanthology.org/nodalida-main-page/2022.acl-long.484.pdf
- Code:
- webis-de/acl-22
- Data:
- MS MARCO, SQuAD, TriviaQA