Jun Yan

Other people with similar names: Jun Yan (Tsinghua, USC, Google)

Unverified author pages with similar names: Jun Yan


2026

Although large language models (LLMs) excel in complex reasoning tasks, they suffer from severe causal hallucination in event causality identification (ECI), particularly in smaller models (1.5B parameters). A promising approach to address this issue is to fine-tune them with Chain-of-Thought (CoT) traces. However, there is currently a lack of CoT trace dataset available for ECI. In this paper, we first investigate the essential criteria that effective CoT traces should possess to mitigate causal hallucination in smaller models. We then design a pipeline to generate CoT traces that meet these criteria. Moreover, since there is currently no metric for quantifying causal hallucination, we also introduce a new metric, the Causal Hallucination Rate (CHR), to quantify causal hallucination, guide the formulation of effective CoT trace criteria, and validate the effectiveness of our pipeline. Our experiments show that fine-tuning with the CoT traces generated by our pipeline not only substantially reduces causal hallucination in smaller LLMs but also improves mean accuracy. Moreover, the fine-tuned models exhibit strong cross-dataset and cross-difficulty generalization, as well as robustness under intentionally misleading intervention prompts.
Event Causality Identification (ECI) aims to identify causal relationships between events, which is essential for root cause analysis. While recent studies reveal that Large Language Models (LLMs) exhibit significant causal hallucination, a systematic evaluation of their document-level ECI performance across varied structural characteristics and a corresponding dataset is currently lacking. To fill this gap, we first construct a structure-controlled dataset to comprehensively assess their document-level ECI performance across texts with various structural characteristics that influence the causal behaviors in ECI. We find that different LLMs exhibit divergent causal bias across texts with varied structures, ranging from consistent hallucination or neglect to structure-dependent shifts between the two. To mitigate the bias, furthermore, we formulate ECI as a causal inference problem and propose a causality identification framework grounded in the potential outcomes and the Halpern–Pearl (HP) definition of actual causality theory. Experimental results demonstrate that our framework significantly reduces the causal bias associated with directly using LLMs on ECI, while also achieving superior performance.