Yunxiao Zhao


2026

Logical reasoning with large language models (LLMs) has made significant progress in recent years. However, existing methods still suffer from insufficient semantic grounding of rules and weak rule application mechanisms, making it difficult to understand rules precisely and apply them effectively in complex multi-step reasoning. To address this, we propose Leibniz, a theory-of-mind-driven neuro-symbolic reasoning framework. Specifically, we construct a bidirectional reasoning model based on multi-agent collaboration, which characterizes the reasoning process from two complementary perspectives, namely the Evolution Agent and the Reduction Agent. The former transforms belief-unstable propositions into stable ones that are beneficial for proving the target conclusion. The latter performs reverse reduction from the target to resolve belief conflicts and distill new inferential insights. Both agents share a belief state space and achieve efficient collaborative reasoning through continual belief updating. We integrate natural language and symbolic representations throughout the reasoning process. Experimental results demonstrate that Leibniz surpasses existing state-of-the-art models in reasoning accuracy, and further analyses reveal its substantial advantages in reliability and flexibility.
Document-level Event Causality Identification (DECI) aims to identify causal relations among multiple events within unstructured text. Existing methods predominantly rely on local semantic similarity to classify each event pair independently, thereby overlooking both the influence of the overall narrative backbone on the propagation of causal dependencies and the differentiated roles of events within multi-cause/multi-effect structures. To address this, we propose a suggest-verify-revise approach for document-level Event Causality Identification with narrative consistency (SVRECI). In the suggest stage, we integrate multi-dimensional heuristic causal suggestions generated by an LLM with structural suggestions derived from hypergraph modeling to provide multi-source initial support for candidate event pairs. In the verify stage, we introduce a Topological Hawkes process to perform constrained verification of narrative propagation consistency among events. In the revise stage, we construct a dynamically evolving document-level causal graph and incorporate a structure-aware dual-level contrastive learning mechanism at both the event and event-pair levels, progressively reducing noisy edges over multiple iterations. Experimental results on the EventStoryLine and Causal-TimeBank datasets demonstrate that our approach outperforms previous methods.

2025

Multi-hop question answering (MHQA) aims to derive an answer from multi-source, knowledge-intensive retrieved documents. However, modeling the importance of the retrieved knowledge is very challenging. Previous approaches primarily emphasize single-step or multi-step iterative decomposition or retrieval, which are susceptible to failure in long-chain reasoning due to the progressive accumulation of erroneous information. To address this problem, we propose a novel Local-tO-Global optimized retrieval method (LOG) that discovers more beneficial information to facilitate MHQA. In particular, we design local information modeling based on pointwise conditional V-information to cover usable documents that contain reasoning knowledge. We also improve the tuplet objective loss, advancing multi-example-aware global optimization to model the relationships between scattered documents. Extensive experimental results demonstrate that our proposed method outperforms prior state-of-the-art models and significantly improves multi-hop reasoning, notably for long-chain reasoning.

2024

Most existing rationalization approaches are susceptible to degeneration accumulation because they lack effective control over the learning direction of the model during training. To address this issue, we propose a novel approach, AGR (Agent-Guided Rationalization), which guides the next action of the model based on its current training state. Specifically, we introduce causal intervention calculus to quantify the causal effects inherent in rationale training, and use a reinforcement learning process to refine the learning bias. Furthermore, we pretrain an agent within this reinforced causal environment to guide the next step of the model. We theoretically demonstrate that a good model needs the desired guidance, and empirically show the effectiveness of our approach, which outperforms existing state-of-the-art methods on the BeerAdvocate and HotelReview datasets.

2021