Abstract
Creating disinformation classifiers is time-consuming and expensive, and it requires substantial effort from experts across different fields. Even when these efforts succeed, their roll-out to publicly available applications stagnates. While these models struggle to reach consumers, disinformation behavior online evolves rapidly. Hoaxes are shared in various abbreviated forms on social networks, often in user-restricted areas, making external monitoring and intervention virtually impossible. To re-purpose existing NLP methods for this new paradigm of sharing misinformation, we propose using information about a text’s originating news source as a proxy for the text’s trustworthiness. We first present a methodology for determining the sources’ overall credibility. We demonstrate our pipeline construction in a specific language and introduce CNSC: a novel dataset for Czech articles’ news source and source credibility classification. We establish initial benchmarks on multiple architectures. Lastly, we create in-the-wild wrapper applications of the trained models: a chatbot, a browser extension, and a standalone web application.

- Anthology ID: 2022.nlp4pi-1.10
- Volume: Proceedings of the Second Workshop on NLP for Positive Impact (NLP4PI)
- Month: December
- Year: 2022
- Address: Abu Dhabi, United Arab Emirates (Hybrid)
- Editors: Laura Biester, Dorottya Demszky, Zhijing Jin, Mrinmaya Sachan, Joel Tetreault, Steven Wilson, Lu Xiao, Jieyu Zhao
- Venue: NLP4PI
- Publisher: Association for Computational Linguistics
- Pages: 79–88
- URL: https://aclanthology.org/2022.nlp4pi-1.10
- DOI: 10.18653/v1/2022.nlp4pi-1.10
- Cite (ACL): Matyas Bohacek. 2022. Misinformation Detection in the Wild: News Source Classification as a Proxy for Non-article Texts. In Proceedings of the Second Workshop on NLP for Positive Impact (NLP4PI), pages 79–88, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
- Cite (Informal): Misinformation Detection in the Wild: News Source Classification as a Proxy for Non-article Texts (Bohacek, NLP4PI 2022)
- PDF: https://preview.aclanthology.org/nschneid-patch-2/2022.nlp4pi-1.10.pdf