Liu He
2026
From Behavior to Geometry: A Causal and Geometric Analysis of LoRA-Based Domain Adaptation
Yizhe WANG | Liu He | Zhenhua Ling
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Parameter-efficient fine-tuning with Low-Rank Adaptation (LoRA) often improves a large language model’s in-domain performance at the cost of cross-domain generalization. We investigate the mechanistic basis for this trade-off, asking whether LoRA creates new discriminative directions in representation space (emergence) or merely reshapes pre-existing ones. Using a Word Sense Disambiguation testbed, we couple controlled behavioral evaluation with causal localization and geometric diagnostics. We find that LoRA learns new, spatially localized discriminative directions in the middle layers of the network, focused at token positions critical for the task. This "subspace extension" account explains why LoRA-tuned models excel on in-domain data but struggle to transfer. As a proof of concept, we introduce a mechanistically informed LoRA configuration that concentrates capacity in the identified layers, promotes rank diversity, and applies light answer-token calibration. Without increasing the training budget, it yields consistent improvements in both in-domain and cross-domain settings, demonstrating that mechanistic insight can guide more efficient adaptation.
2025
DocAgent: An Agentic Framework for Multi-Modal Long-Context Document Understanding
Li Sun | Liu He | Shuyue Jia | Yangfan He | Chenyu You
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Recent advances in large language models (LLMs) have demonstrated significant promise in document understanding and question answering. Despite this progress, existing approaches either can process only short documents because of limited context length or fail to fully leverage multi-modal information. In this work, we introduce DocAgent, a multi-agent framework for long-context document understanding that imitates human reading practice. Specifically, we first extract a structured, tree-formatted outline from documents to help agents identify relevant sections efficiently. Further, we develop an interactive reading interface that enables agents to query and retrieve various types of content dynamically. To ensure answer reliability, we introduce a reviewer agent that cross-checks responses using complementary sources and maintains a task-agnostic memory bank to facilitate knowledge sharing across tasks. We evaluate our method on two long-context document understanding benchmarks, where it bridges the gap to human-level performance by surpassing competitive baselines, while maintaining a short context length. Our code is available at https://github.com/lisun-ai/DocAgent.