C4: A Multilingual Benchmark for Retrieval-Augmented Generation Based on the Catechism of the Catholic Church and Its Compendium

Pius von Däniken, Mark Cieliebak, Jan Deriu


Abstract
We introduce a new multilingual case study for evaluating retrieval-augmented generation (RAG) systems, based on the Catechism of the Catholic Church and its Compendium. The Catechism is a structured document with numbered paragraphs, officially translated into many languages under strict editorial alignment. The Compendium reformulates this material into a question-answer format with explicit citations to the corresponding paragraphs. Together, they form a set of parallel monolingual corpora that share identical semantic structure, enabling direct, controlled comparison of RAG performance across languages. Beyond its theological origin, this text pair closely mirrors real-world applications of RAG in institutional contexts, such as querying internal policy documents with associated FAQ-style summaries, making it a practical testbed for multilingual retrieval and grounded answer generation. We release our data collection scripts and baseline results for further research.
Anthology ID:
2026.lrec-main.590
Volume:
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month:
May
Year:
2026
Address:
Palma de Mallorca, Spain
Editors:
Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue:
LREC
Publisher:
ELRA Language Resource Association
Pages:
7446–7456
URL:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.590/
Cite (ACL):
Pius von Däniken, Mark Cieliebak, and Jan Deriu. 2026. C4: A Multilingual Benchmark for Retrieval-Augmented Generation Based on the Catechism of the Catholic Church and Its Compendium. International Conference on Language Resources and Evaluation, main:7446–7456.
Cite (Informal):
C4: A Multilingual Benchmark for Retrieval-Augmented Generation Based on the Catechism of the Catholic Church and Its Compendium (von Däniken et al., LREC 2026)
PDF:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.590.pdf