Shreya Savant

2026

Mitigating Causal Bias in LLMs via Potential Outcomes Framework and Actual Causality Theory
Yiheng Zhao | Yuanliang Li | Shreya Savant | Jun Yan
Findings of the Association for Computational Linguistics: EACL 2026

Event Causality Identification (ECI) aims to identify causal relationships between events, which is essential for root cause analysis. Although recent studies reveal that Large Language Models (LLMs) exhibit significant causal hallucination, both a systematic evaluation of their document-level ECI performance across varied structural characteristics and a corresponding dataset are currently lacking. To fill this gap, we first construct a structure-controlled dataset to comprehensively assess LLMs' document-level ECI performance on texts whose structural characteristics influence causal behavior in ECI. We find that different LLMs exhibit divergent causal biases across texts with varied structures, ranging from consistent hallucination or neglect to structure-dependent shifts between the two. To mitigate this bias, we further formulate ECI as a causal inference problem and propose a causality identification framework grounded in the potential outcomes framework and the Halpern–Pearl (HP) definition of actual causality. Experimental results demonstrate that our framework significantly reduces the causal bias associated with directly using LLMs on ECI, while also achieving superior performance.

Venues

Findings (1)
