Construction of a Japanese RAG Benchmark Using Synthetic Documents on Non-existent Entities and Events

Shengzhe Li, Masaya Ohagi, Hayato Tsukagoshi, Akihiko Fukuchi, Tomohide Shibata, Daisuke Kawahara


Abstract
Retrieval-augmented generation (RAG) is a technique in which a large language model (LLM) generates answers based on relevant documents retrieved from an external document collection. Existing RAG evaluation benchmarks often use public data, such as Wikipedia and news articles, as the external document collection. However, such data are very likely to be included in the LLM’s pre-training corpus already, which can prevent an accurate evaluation of the model’s ability to generate answers based on the retrieved documents. In this study, we construct a Japanese RAG benchmark by having an LLM synthesize documents about non-existent entities and events, and we use this collection of synthetic documents as the retrieval target. Since these synthetic documents are not included in the LLM’s training data, the ability to generate answers based on retrieved documents can be evaluated more accurately. In addition to the synthetic documents, the benchmark comprises questions and correct answers, which are created using a combination of LLMs and human effort. We then evaluate and analyze the RAG performance of existing LLMs using the constructed benchmark.
Anthology ID:
2026.lrec-main.589
Volume:
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month:
May
Year:
2026
Address:
Palma de Mallorca, Spain
Editors:
Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue:
LREC
Publisher:
ELRA Language Resource Association
Pages:
7435–7445
URL:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.589/
Cite (ACL):
Shengzhe Li, Masaya Ohagi, Hayato Tsukagoshi, Akihiko Fukuchi, Tomohide Shibata, and Daisuke Kawahara. 2026. Construction of a Japanese RAG Benchmark Using Synthetic Documents on Non-existent Entities and Events. International Conference on Language Resources and Evaluation, main:7435–7445.
Cite (Informal):
Construction of a Japanese RAG Benchmark Using Synthetic Documents on Non-existent Entities and Events (Li et al., LREC 2026)
PDF:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.589.pdf