BanglaClickBERT: Bangla Clickbait Detection from News Headlines using Domain Adaptive BanglaBERT and MLP Techniques

Saman Sarker Joy, Tanusree Das Aishi, Naima Tahsin Nodi, Annajiat Alim Rasel


Abstract
News headlines or titles that deliberately persuade readers to view a particular online content are referred to as clickbait. There have been numerous studies focused on clickbait detection in English language, compared to that, there have been very few researches carried out that address clickbait detection in Bangla news headlines. In this study, we have experimented with several distinctive transformers models, namely BanglaBERT and XLM-RoBERTa. Additionally, we introduced a domain-adaptive pretrained model, BanglaClickBERT. We conducted a series of experiments to identify the most effective model. The dataset we used for this study contained 15,056 labeled and 65,406 unlabeled news headlines; in addition to that, we have collected more unlabeled Bangla news headlines by scraping clickbait-dense websites making a total of 1 million unlabeled news headlines in order to make our BanglaClickBERT. Our approach has successfully surpassed the performance of existing state-of-the-art technologies providing a more accurate and efficient solution for detecting clickbait in Bangla news headlines, with potential implications for improving online content quality and user experience.
Anthology ID:
2023.alta-1.1
Volume:
Proceedings of the 21st Annual Workshop of the Australasian Language Technology Association
Month:
November
Year:
2023
Address:
Melbourne, Australia
Editors:
Smaranda Muresan, Vivian Chen, Kennington Casey, Vandyke David, Dethlefs Nina, Inoue Koji, Ekstedt Erik, Ultes Stefan
Venue:
ALTA
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1–10
Language:
URL:
https://aclanthology.org/2023.alta-1.1
DOI:
Bibkey:
Cite (ACL):
Saman Sarker Joy, Tanusree Das Aishi, Naima Tahsin Nodi, and Annajiat Alim Rasel. 2023. BanglaClickBERT: Bangla Clickbait Detection from News Headlines using Domain Adaptive BanglaBERT and MLP Techniques. In Proceedings of the 21st Annual Workshop of the Australasian Language Technology Association, pages 1–10, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
BanglaClickBERT: Bangla Clickbait Detection from News Headlines using Domain Adaptive BanglaBERT and MLP Techniques (Joy et al., ALTA 2023)
Copy Citation:
PDF:
https://preview.aclanthology.org/nschneid-patch-1/2023.alta-1.1.pdf