Zitao Li
2026
Respecting Temporal-Causal Consistency: Entity-Event Knowledge Graph for Retrieval-Augmented Generation
Ze Yu Zhang | Zitao Li | Yaliang Li | Bolin Ding | Bryan Kian Hsiang Low
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Ze Yu Zhang | Zitao Li | Yaliang Li | Bolin Ding | Bryan Kian Hsiang Low
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Retrieval-augmented generation (RAG) based on large language models often falters on narrative documents with inherent temporal structures. Standard unstructured RAG methods rely solely on embedding-similarity matching and lack any general mechanism to encode or exploit chronological information, while knowledge graph RAG (KG-RAG) frameworks collapse every mention of an entity into a single node, erasing the evolving context that drives many queries. To formalize this challenge and draw the community’s attention, we construct ChronoQA, a robust and discriminative QA benchmark that measures temporal, causal, and character consistency understanding in narrative documents (e.g., novels) under the RAG setting. We then introduce Entity-Event RAG (E 2RAG), a dual-graph framework that keeps separate entity and event subgraphs linked by a bipartite mapping, thereby preserving the temporal and causal facets needed for fine-grained reasoning. Across ChronoQA, our approach outperforms state-of-the-art unstructured and KG-based RAG baselines, with notable gains on causal and character consistency queries. E 2RAG therefore offers a practical path to more context-aware retrieval for tasks that require precise answers grounded in chronological information.
Retrieval Heads are Dynamic
Yuping Lin | Zitao Li | Yue Xing | Pengfei He | Yingqian Cui | Yaliang Li | Bolin Ding | Jingren Zhou | Jiliang Tang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Yuping Lin | Zitao Li | Yue Xing | Pengfei He | Yingqian Cui | Yaliang Li | Bolin Ding | Jingren Zhou | Jiliang Tang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Recent studies have identified "retrieval heads" in Large Language Models (LLMs) responsible for extracting information from input contexts. However, prior works largely rely on static statistics aggregated across datasets, identifying heads that perform retrieval on average. This perspective overlooks the fine-grained temporal dynamics of autoregressive generation. In this paper, we investigate retrieval heads from a dynamic perspective. Through extensive analysis, we establish three core claims: (1) Dynamism: Retrieval heads vary dynamically across timesteps; (2) Irreplaceability: Dynamic retrieval heads are specific at each timestep and cannot be effectively replaced by static retrieval heads; and (3) Correlation: The model’s hidden state encodes a predictive signal for future retrieval head patterns, indicating an internal planning mechanism. We validate these findings on the Needle-in-a-Haystack task and a multi-hop QA task, and quantify the differences on the utility of dynamic and static retrieval heads in a Dynamic Retrieval-Augmented Generation framework. Our study provides new insights into the internal mechanisms of LLMs.
2025
Advancing Reasoning with Off-the-Shelf LLMs: A Semantic Structure Perspective
Pengfei He | Zitao Li | Yue Xing | Yaliang Li | Jiliang Tang | Bolin Ding
Findings of the Association for Computational Linguistics: EMNLP 2025
Pengfei He | Zitao Li | Yue Xing | Yaliang Li | Jiliang Tang | Bolin Ding
Findings of the Association for Computational Linguistics: EMNLP 2025
Large Language Models (LLMs) have shown strong capabilities in zero-shot reasoning and generalization to new tasks. However, the zero-shot performance of general LLMs on complex tasks, such as multi-hop reasoning, remains suboptimal, while reasoning LLMs suffer from hallucinations and unfaithfulness. In this paper, to handle these limitations, we introduce a novel structure analysis method that helps LLMs better understand the question structure and guide the problem-solving process. We demonstrate that existing reasoning strategies, such as Chain-of-Thought and ReAct, significantly benefit from the LLM’s inherent understanding of semantic structure. We further ground our method in the theory of probabilistic graphical models to support its effectiveness. To enhance the reasoning process, we augment the structure analysis with refinement and retrieval capabilities, forming a multi-agent reasoning system called Structure-oriented Autonomous Reasoning Agents (SARA). Extensive experiments show that SARA significantly improves zero-shot performance on knowledge-intensive and mathematical tasks. Remarkably, our approach makes a general LLM competitive with dedicated reasoning models in several benchmarks and demonstrates strong robustness against corrupted reasoning paths.