Constructing a Dataset for Hallucination Detection in Japanese Summarization with Fine-grained Faithfulness Labels

Hikari Tanaka; Atsushi Keyaki; Mamoru Komachi

Abstract

Large language models (LLMs) can generate fluent text, but the quality of generated content crucially depends on its consistency with the given input. This aspect is commonly referred to as faithfulness, which concerns whether the output is properly grounded in the input context. A major challenge related to faithfulness is that generated content may include information not supported by the input or may contradict it. This phenomenon is often referred to as hallucination, and increasing attention has been paid to automatic hallucination detection, which determines whether an LLM’s output is hallucinated. To evaluate the performance of hallucination detection systems, researchers use evaluation datasets with labels indicating the presence or absence of hallucinations. While such datasets have been developed for English and Chinese, Japanese evaluation resources for hallucination detection remain limited. Therefore, we constructed a Japanese evaluation dataset for hallucination detection in summarization by manually annotating sentence-level faithfulness labels in LLM-generated summaries of Japanese documents. We annotated 390 summaries (1,938 sentences) generated by three LLMs with sentence-level multi-label annotations for faithfulness with respect to the input document. The taxonomy extends a prior classification scheme and captures distinct patterns of model errors, enabling both binary hallucination detection and fine-grained error-type analysis of Japanese LLM summarization.

Anthology ID: 2026.eacl-srw.15
Volume: Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Month: March
Year: 2026
Address: Rabat, Morocco
Editors: Selene Baez Santamaria, Sai Ashish Somayajula, Atsuki Yamaguchi
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 207–218
URL: https://preview.aclanthology.org/ingest-eacl/2026.eacl-srw.15/
Cite (ACL): Hikari Tanaka, Atsushi Keyaki, and Mamoru Komachi. 2026. Constructing a Dataset for Hallucination Detection in Japanese Summarization with Fine-grained Faithfulness Labels. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 207–218, Rabat, Morocco. Association for Computational Linguistics.
Cite (Informal): Constructing a Dataset for Hallucination Detection in Japanese Summarization with Fine-grained Faithfulness Labels (Tanaka et al., EACL 2026)
PDF: https://preview.aclanthology.org/ingest-eacl/2026.eacl-srw.15.pdf
