Clickbait Spoiling via Question Answering and Passage Retrieval

Matthias Hagen, Maik Fröbe, Artur Jurk, Martin Potthast


Abstract
We introduce and study the task of clickbait spoiling: generating a short text that satisfies the curiosity induced by a clickbait post. Clickbait links to a web page and advertises its contents by arousing curiosity instead of providing an informative summary. Our contributions are approaches to classify the type of spoiler needed (i.e., a phrase or a passage), and to generate appropriate spoilers. A large-scale evaluation and error analysis on a new corpus of 5,000 manually spoiled clickbait posts—the Webis Clickbait Spoiling Corpus 2022—shows that our spoiler type classifier achieves an accuracy of 80%, while the question answering model DeBERTa-large outperforms all others in generating spoilers for both types.
Anthology ID:
2022.acl-long.484
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
7025–7036
URL:
https://aclanthology.org/2022.acl-long.484
DOI:
10.18653/v1/2022.acl-long.484
Cite (ACL):
Matthias Hagen, Maik Fröbe, Artur Jurk, and Martin Potthast. 2022. Clickbait Spoiling via Question Answering and Passage Retrieval. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7025–7036, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Clickbait Spoiling via Question Answering and Passage Retrieval (Hagen et al., ACL 2022)
PDF:
https://preview.aclanthology.org/nodalida-main-page/2022.acl-long.484.pdf
Software:
 2022.acl-long.484.software.zip
Video:
 https://preview.aclanthology.org/nodalida-main-page/2022.acl-long.484.mp4
Code:
 webis-de/acl-22
Data:
MS MARCO, SQuAD, TriviaQA