Kaijie Zhu
2025
Disentangling Logic: The Role of Context in Large Language Model Reasoning Capabilities
Wenyue Hua | Kaijie Zhu | Lingyao Li | Lizhou Fan | Mingyu Jin | Shuhang Lin | Haochen Xue | Zelong Li | Jindong Wang | Yongfeng Zhang
Findings of the Association for Computational Linguistics: ACL 2025
This study systematically disentangles pure logical reasoning from text understanding by investigating the contrast between abstract and contextualized logical problems across a comprehensive set of domains. We explore whether LLMs demonstrate genuine reasoning capabilities across various domains when the underlying logical structure remains constant. We focus on two main questions: (1) Can abstract logical problems alone accurately benchmark LLMs’ reasoning ability in real-world scenarios, disentangled from the contextual support present in practical settings? (2) Does fine-tuning LLMs on abstract logic problems generalize to contextualized logic problems, and vice versa? To investigate these questions, we focus on standard propositional logic, specifically propositional deductive and abductive reasoning. We construct datasets for both reasoning types with four difficulty levels across 12 distinct domains based on the Wikipedia categorization, in addition to datasets with purely abstract variables. Our experiments aim to provide insights into disentangling context in logical reasoning, the genuine reasoning capabilities of LLMs, and their generalization potential. Code and data are available at https://anonymous.4open.science/r/ContextHub-957E.
2024
AgentReview: Exploring Peer Review Dynamics with LLM Agents
Yiqiao Jin | Qinlin Zhao | Yiyang Wang | Hao Chen | Kaijie Zhu | Yijia Xiao | Jindong Wang
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Peer review is fundamental to the integrity and advancement of scientific publication. Traditional methods of peer review analysis often rely on exploration and statistics of existing peer review data, which do not adequately address the multivariate nature of the process or account for latent variables, and are further constrained by privacy concerns due to the sensitive nature of the data. We introduce AgentReview, the first large language model (LLM) based peer review simulation framework, which effectively disentangles the impacts of multiple latent factors and addresses the privacy issue. Our study reveals significant insights, including a notable 37.1% variation in paper decisions due to reviewers’ biases, supported by sociological theories such as social influence theory, altruism fatigue, and authority bias. We believe this study can offer valuable insights for improving the design of peer review mechanisms.