CausalEval: Towards Better Causal Reasoning in Language Models

Longxuan Yu; Delin Chen; Siheng Xiong; Qingyang Wu; Dawei Li; Zhikai Chen; Xiaoze Liu; Liangming Pan

Abstract

Causal reasoning (CR) is a crucial aspect of intelligence, essential for problem-solving, decision-making, and understanding the world. While language models (LMs) can generate rationales for their outputs, their ability to reliably perform causal reasoning remains uncertain, often falling short in tasks requiring a deep understanding of causality. In this paper, we introduce CausalEval, a comprehensive review of research aimed at enhancing LMs for causal reasoning, coupled with an empirical evaluation of current models and methods. We categorize existing methods based on the role of LMs: either as reasoning engines or as helpers providing knowledge or data to traditional CR methods, followed by a detailed discussion of methodologies in each category. We then assess the performance of current LMs and various enhancement methods on a range of causal reasoning tasks, providing key findings and in-depth analysis. Finally, we present insights from current studies and highlight promising directions for future research. We aim for this work to serve as a comprehensive resource, fostering further advancements in causal reasoning with LMs.

Anthology ID: 2025.naacl-long.622
Volume: Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month: April
Year: 2025
Address: Albuquerque, New Mexico
Editors: Luis Chiruzzo, Alan Ritter, Lu Wang
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 12512–12540
URL: https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.naacl-long.622/
Cite (ACL): Longxuan Yu, Delin Chen, Siheng Xiong, Qingyang Wu, Dawei Li, Zhikai Chen, Xiaoze Liu, and Liangming Pan. 2025. CausalEval: Towards Better Causal Reasoning in Language Models. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 12512–12540, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): CausalEval: Towards Better Causal Reasoning in Language Models (Yu et al., NAACL 2025)
PDF: https://preview.aclanthology.org/Ingest-2025-COMPUTEL/2025.naacl-long.622.pdf
