VeritasQA: A Truthfulness Benchmark Aimed at Multilingual Transferability

Javier Aula-Blasco; Júlia Falcão; Susana Sotelo; Silvia Paniagua; Aitor González-Agirre; Marta Villegas

Abstract

As Large Language Models (LLMs) become available in a wider range of domains and applications, evaluating the truthfulness of multilingual LLMs is an issue of increasing relevance. TruthfulQA (Lin et al., 2022) is one of the few benchmarks designed to evaluate how models imitate widespread falsehoods. However, it is strongly English-centric and starting to become outdated. We present VeritasQA, a context- and time-independent truthfulness benchmark built with multilingual transferability in mind, and available in Spanish, Catalan, Galician and English. VeritasQA comprises a set of 353 questions and answers inspired by common misconceptions and falsehoods that are not tied to any particular country or recent event. We release VeritasQA under an open license and present the evaluation results of 15 models of various architectures and sizes.

Anthology ID: 2025.coling-main.366
Volume: Proceedings of the 31st International Conference on Computational Linguistics
Month: January
Year: 2025
Address: Abu Dhabi, UAE
Editors: Owen Rambow, Leo Wanner, Marianna Apidianaki, Hend Al-Khalifa, Barbara Di Eugenio, Steven Schockaert
Venue: COLING
Publisher: Association for Computational Linguistics
Pages: 5463–5474
URL: https://preview.aclanthology.org/jlcl-multiple-ingestion/2025.coling-main.366/
Cite (ACL): Javier Aula-Blasco, Júlia Falcão, Susana Sotelo, Silvia Paniagua, Aitor Gonzalez-Agirre, and Marta Villegas. 2025. VeritasQA: A Truthfulness Benchmark Aimed at Multilingual Transferability. In Proceedings of the 31st International Conference on Computational Linguistics, pages 5463–5474, Abu Dhabi, UAE. Association for Computational Linguistics.
Cite (Informal): VeritasQA: A Truthfulness Benchmark Aimed at Multilingual Transferability (Aula-Blasco et al., COLING 2025)
PDF: https://preview.aclanthology.org/jlcl-multiple-ingestion/2025.coling-main.366.pdf
