Chunyu Yang
2026
COSMOS: Connectivity-Oriented Submodular Maximization for Optimal Subgraph Retrieval
Boci Peng | Xiao Liu | Boren Hu | Yun Zhu | Xuanbo Fan | Yanwei Yue | Chunyu Yang | Yan Zhang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Retrieving coherent evidence subgraphs is critical for Knowledge Base Question Answering (KBQA). Existing paradigms often treat facts independently, rely on biased heuristics, or employ myopic search, failing to optimize collective subgraph utility. In this paper, we propose **COSMOS** (**C**onnectivity-**O**riented **S**ubmodular **M**aximization for **O**ptimal **S**ubgraph Retrieval), a unified framework that formalizes evidence retrieval as a constrained submodular maximization problem. This formulation mathematically captures the trade-off between information relevance and structural complexity. To tractably solve this combinatorial challenge, COSMOS employs a decompose-and-conquer strategy, which first performs a seed-guided greedy expansion to maximize local semantic utility, followed by a topology-aware component aggregation to bridge disjoint evidence clusters via Maximum Spanning Tree aggregation. Guided by theoretical bounds, we introduce Structure-Aware Contrastive Tuning to align semantic space with KG topology. Experimental results on WebQSP, CWQ, and M3GQA benchmarks demonstrate that COSMOS achieves state-of-the-art performance.
Failures are Treasures: Constructing a Pedagogical Bridge for Agentic Strategy Distillation
Jiaxin Guo | Hao Sun | Wenhao Zhang | Chunyu Yang | Yan Zhang
Findings of the Association for Computational Linguistics: ACL 2026
While Large Language Models (LLMs) excel in autonomous agent settings, small language models (SLMs) remain fragile, often collapsing after encountering errors. Traditional knowledge distillation focuses on imitating successful trajectories, while existing "learning from mistakes" methods treat errors as auxiliary signals rather than states requiring recoverable policies, leaving the dynamics of failure and recovery in agent settings largely unexplored. Inspired by Donald Schön’s theory of reflective practice, we propose P-BRIDGE (Pedagogical Bridge for Reflective Insight and Distillation of Guiding Errors). P-BRIDGE combines reflection-in-action with reflection-on-action, enabling agents to diagnose and correct critical errors during execution while abstracting transferable strategies from contrastive student–teacher trajectories. Experiments across eight benchmarks demonstrate that P-BRIDGE significantly elevates SLM performance—e.g., raising the 2WikiMultiHopQA accuracy of a 0.6B model from 6.2% to 34.2%.
2025
Unveil: Unified Visual-Textual Integration and Distillation for Multi-modal Document Retrieval
Hao Sun | Yingyan Hou | Jiayan Guo | Bo Wang | Chunyu Yang | Jinsong Ni | Yan Zhang
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Document retrieval in real-world scenarios faces significant challenges due to diverse document formats and modalities. Traditional text-based approaches rely on tailored parsing techniques that disregard layout information and are prone to errors, while recent parsing-free visual methods often struggle to capture fine-grained textual semantics in text-rich scenarios. To address these limitations, we propose Unveil, a novel visual-textual embedding framework that effectively integrates textual and visual features for robust document representation. Through knowledge distillation, we transfer the semantic understanding capabilities from the visual-textual embedding model to a purely visual model, enabling efficient parsing-free retrieval while preserving semantic fidelity. Experimental results demonstrate that our visual-textual embedding method surpasses existing approaches, while knowledge distillation successfully bridges the performance gap between visual-textual and visual-only methods, improving both retrieval accuracy and efficiency.