Václav Moravec

Also published as: Vaclav Moravec


2023

Czech-ing the News: Article Trustworthiness Dataset for Czech
Matyas Bohacek | Michal Bravansky | Filip Trhlík | Vaclav Moravec
Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis

We present the Verifee dataset: a multimodal dataset of news articles with fine-grained trustworthiness annotations. We bring together a diverse set of researchers from the social, media, and computer sciences to study this interdisciplinary problem holistically and develop a detailed methodology that assesses the texts through the lens of editorial transparency, journalistic conventions, and objective reporting while penalizing manipulative techniques. We collect over 10,000 annotated articles from 60 Czech online news sources. Each item is categorized into one of 4 proposed classes on the credibility spectrum – ranging from entirely trustworthy articles to deceptive ones – and annotated for its manipulative attributes. We fine-tune prominent sequence-to-sequence language models for the trustworthiness classification task on our dataset and report a best F1 score of 0.53. We open-source the dataset, annotation methodology, and annotators’ instructions in full at https://www.verifee.ai/research/ to enable follow-up work.

2022

Annotating Attribution in Czech News Server Articles
Barbora Hladka | Jiří Mírovský | Matyáš Kopp | Václav Moravec
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This paper focuses on detecting sources in Czech articles published on a news server of Czech public radio. In particular, we search for attribution in sentences and recognize attributed sources and their sentence context (signals). We organized a crowdsourcing annotation task that resulted in a data set of 2,167 stories with manually recognized signals and sources. In addition, the sources were classified as named or unnamed.