FactCorp: A Corpus of Dutch Fact-checks and its Multiple Usages

Marten van der Meulen; W. Gudrun Reijnierse

Abstract

Fact-checking information before publication has long been a core task for journalists, but recent times have seen the emergence of dedicated news items specifically aimed at fact-checking after publication. This relatively new form of fact-checking receives a fair amount of attention from academics, with current research focusing mostly on journalists’ motivations for publishing post-hoc fact-checks, the effects of fact-checking on the perceived accuracy of false claims, and the creation of computational tools for automatic fact-checking. In this paper, we propose to study fact-checks from a corpus linguistic perspective. This will enable us to gain insight into the scope and contents of fact-checks, to investigate what fact-checks can teach us about the way in which science appears (incorrectly) in the news, and to see how fact-checks behave in the science communication landscape. We report on the creation of FactCorp, a 1.16 million-word corpus containing 1,974 fact-checks from three major Dutch newspapers. We also present results of several exploratory analyses, including a rhetorical moves analysis, a qualitative content elements analysis, and keyword analyses. Through these analyses, we aim to demonstrate the wealth of possible applications that FactCorp allows, thereby stressing the importance of creating such resources.

Anthology ID: 2020.lrec-1.161
Volume: Proceedings of the Twelfth Language Resources and Evaluation Conference
Month: May
Year: 2020
Address: Marseille, France
Editors: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue: LREC
Publisher: European Language Resources Association
Pages: 1286–1292
Language: English
URL: https://aclanthology.org/2020.lrec-1.161
Cite (ACL): Marten van der Meulen and W. Gudrun Reijnierse. 2020. FactCorp: A Corpus of Dutch Fact-checks and its Multiple Usages. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 1286–1292, Marseille, France. European Language Resources Association.
Cite (Informal): FactCorp: A Corpus of Dutch Fact-checks and its Multiple Usages (van der Meulen & Reijnierse, LREC 2020)
PDF: https://preview.aclanthology.org/ingest-acl-2023-videos/2020.lrec-1.161.pdf
