Zhengzhang Chen
2026
Decoding Time Series with LLMs: A Multi-Agent Framework for Cross-Domain Annotation
Minhua Lin | Zhengzhang Chen | Yanchi Liu | Xujiang Zhao | Zongyu Wu | Junxiang Wang | Xiang Zhang | Suhang Wang | Haifeng Chen
Findings of the Association for Computational Linguistics: EACL 2026
Time series data is ubiquitous across various domains, including manufacturing, finance, and healthcare. High-quality annotations are essential for effectively understanding time series and facilitating downstream tasks. However, obtaining such annotations is challenging, particularly in mission-critical domains. In this paper, we propose TESSA, a multi-agent system designed to automatically generate both general and domain-specific annotations for time series data. TESSA introduces two agents: a general annotation agent and a domain-specific annotation agent. The general agent captures common patterns and knowledge across multiple source domains, leveraging both time-series-wise and text-wise features to generate general annotations. Meanwhile, the domain-specific agent utilizes limited annotations from the target domain to learn domain-specific terminology and generate targeted annotations. Extensive experiments on multiple synthetic and real-world datasets demonstrate that TESSA effectively generates high-quality annotations, outperforming existing methods.
Multi-Agent Procedural Graph Extraction with Structural and Logical Refinement
Wangyang Ying | Yanchi Liu | Xujiang Zhao | Wei Cheng | Zhengzhang Chen | Wenchao Yu | Yanjie Fu | Haifeng Chen
Findings of the Association for Computational Linguistics: EACL 2026
Automatically extracting workflows as procedural graphs from natural language is a promising yet underexplored task that requires ensuring both structural validity and logical alignment. Recent advances in large language models (LLMs) show potential for graph extraction, but often yield ill-formed structures or misinterpret logical constructs such as gateways. We introduce a multi-agent framework that treats procedural graph extraction as a multi-round reasoning process with structural and logical refinement agents. The framework operates in three iterative stages: (1) an LLM-based graph extraction phase, (2) a structural feedback phase where a simulation agent diagnoses and explains structural issues, and (3) a logical feedback phase where a semantic agent aligns semantics between flow logic and linguistic cues in the source text. Important feedback is prioritized and expressed in natural language, which is injected into the next-round prompt, enabling interpretable and controllable refinement. This modular design allows agents to target distinct error types without supervision or parameter updates. Experiments demonstrate that the framework achieves substantial improvements in both structural correctness and logical consistency over strong baselines.
Mind the Gap in Cultural Alignment: Task-Aware Culture Management for Large Language Models
Binchi Zhang | Xujiang Zhao | Jundong Li | Haifeng Chen | Zhengzhang Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) are increasingly deployed in culturally sensitive real-world tasks. However, existing cultural alignment approaches fail to align LLMs’ broad cultural values with the specific goals of downstream tasks and suffer from cross-culture interference. We propose CultureManager, a novel pipeline for task-specific cultural alignment. CultureManager synthesizes task-aware cultural data in line with target task formats, grounded in culturally relevant web search results. To prevent conflicts between cultural norms, it manages multi-culture knowledge learned in separate adapters with a culture router that selects the appropriate one to apply. Experiments across five national cultures and ten culture-sensitive tasks show consistent improvements over prompt-based and fine-tuning baselines. Our results demonstrate the necessity of task adaptation and modular culture management for effective cultural alignment.
LOKA: Conflict-Aware LLM Knowledge Update with Adaptive Knowledge Memory
Binchi Zhang | Zhengzhang Chen | Zaiyi Zheng | Jundong Li | Haifeng Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large Language Models (LLMs) have achieved remarkable success in natural language processing by encoding extensive knowledge, but their utility relies on timely updates as human knowledge keeps evolving. In this paper, we investigate the problem of LLM knowledge updates, which requires simultaneously unlearning unwanted information and learning new knowledge. Existing approaches that tackle unlearning and learning separately encounter *task conflicts* and *knowledge management issues* when applied to comprehensive knowledge updates. We validate our findings with theoretical analysis and empirical evidence, and propose LOKA, a conflict-aware framework for Large language mOdel Knowledge updAtes. During training, LOKA introduces an adaptive knowledge memory approach in which updated knowledge is allocated across multiple memory units. During inference, LOKA retrieves the most relevant memory unit from the knowledge memory and integrates it with the original LLM to apply updated knowledge, while a learning-based router controls the activation of the knowledge memory to improve knowledge utilization. Extensive experiments demonstrate the efficacy of LOKA in achieving accurate, flexible, and conflict-aware knowledge updates.
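The inference-time step described in this abstract (pick the most relevant memory unit, and only activate it when appropriate) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the names `cosine` and `retrieve_unit`, the use of cosine similarity over key vectors, and the fixed `threshold` that stands in for the learning-based router. It is not LOKA's actual implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def retrieve_unit(query, memory_keys, threshold=0.5):
    """Return the index of the best-matching memory unit.

    Returns None when no unit reaches `threshold`, meaning the caller should
    fall back to the unmodified base model. The fixed threshold is a
    hypothetical stand-in for a learned router.
    """
    best_idx, best_sim = None, threshold
    for idx, key in enumerate(memory_keys):
        sim = cosine(query, key)
        if sim >= best_sim:
            best_idx, best_sim = idx, sim
    return best_idx

# Two hypothetical memory units, each summarized by a key vector.
keys = [[1.0, 0.0], [0.0, 1.0]]
selected = retrieve_unit([0.9, 0.1], keys)                 # close to unit 0
fallback = retrieve_unit([1.0, 1.0], keys, threshold=0.9)  # no unit is close enough
```

Returning `None` keeps the gating decision explicit: a query unrelated to any stored update leaves the original model untouched.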
Representation Interventions Enable Lifelong Knowledge Memory Control in LLMs
Xuyuan Liu | Shengyu Chen | Xinshuai Dong | Yanchi Liu | Xujiang Zhao | Haoyu Wang | Yujun Yan | Haifeng Chen | Zhengzhang Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Large language models (LLMs) often produce incorrect or outdated content. Updating their knowledge efficiently and accurately without costly retraining is a major challenge. This problem is particularly challenging for complex, unstructured knowledge in lifelong settings, where many edits must coexist without interference. We introduce **RILKE** (**R**epresentation **I**ntervention for **L**ifelong **K**nowledg**E** Control), a robust and scalable method that treats knowledge control as interventions within the model’s representation space. Leveraging representation-space expressiveness, we identify two key properties enabling RILKE to achieve fine-grained control over complex, unstructured knowledge while maintaining general utility with frozen base weights. During training, RILKE learns paraphrase-robust and edit-localized modules that limit each update to a low-dimensional subspace to minimize cross-edit interference. In inference, a query-adaptive router selects the appropriate module to guide the model’s generation. Across LLaMA and Qwen models, RILKE scales effectively to large-scale benchmarks, demonstrating high edit success and strong paraphrase generalization while preserving general utility with modest memory overhead. These results show RILKE is an effective and scalable solution for lifelong knowledge control in LLMs.
Uncertainty-Aware Test-Time Search for Optimization Problem Solving
Linlin Yu | Xujiang Zhao | Dong Li | Yanchi Liu | Wei Cheng | Zhengzhang Chen | Chen Zhao | Feng Chen | Haifeng Chen
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Automatically solving optimization problems from natural language descriptions with both efficiency and reliability is highly desirable but remains challenging. Language model hallucinations and the limited availability of labeled datasets often result in misaligned formulations, code errors, and feasibility failures. We propose UMCTS, an Uncertainty-aware Monte Carlo Tree Search framework that combines the language understanding capability of large language models with the reliability of well-established solvers. UMCTS structures the solution process into four stages: global instruction, assumptions, mathematical formulation, and solver code generation. It employs Monte Carlo Tree Search with semantic-equivalence pruning, prior-guided exploration, and solver-based feasibility checks. An LLM judge provides numerical reward signals, qualitative error information, and uncertainty estimates. These signals are backpropagated to guide the search and flag unreliable outputs. Across six public benchmarks, UMCTS achieves state-of-the-art solution accuracy and improves efficiency by reducing token usage.
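As a rough illustration of prior-guided exploration and reward backpropagation in a tree search, the sketch below uses a standard PUCT-style selection rule. This is a generic textbook rule, not necessarily the one UMCTS uses; the `Node` fields, the constant `c`, and the function names are hypothetical, and the uncertainty estimates, pruning, and solver checks mentioned in the abstract are not modeled.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    prior: float                       # prior probability assigned to this candidate step
    visits: int = 0
    value_sum: float = 0.0             # sum of rewards backed up through this node
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

def puct(parent: Node, child: Node, c: float = 1.0) -> float:
    """Mean reward plus an exploration bonus scaled by the child's prior."""
    mean = child.value_sum / child.visits if child.visits else 0.0
    return mean + c * child.prior * math.sqrt(parent.visits) / (1 + child.visits)

def select(parent: Node, c: float = 1.0) -> Node:
    """Pick the child with the highest PUCT score."""
    return max(parent.children, key=lambda ch: puct(parent, ch, c))

def backpropagate(node: Optional[Node], reward: float) -> None:
    """Add the reward to every node on the path from `node` up to the root."""
    while node is not None:
        node.visits += 1
        node.value_sum += reward
        node = node.parent

# Hypothetical tree: the root has a high-prior child `a` and a low-prior child `b`.
root = Node(prior=1.0)
a = Node(prior=0.8, parent=root)
b = Node(prior=0.2, parent=root)
root.children = [a, b]

# Four zero-reward rollouts through `a` shrink its exploration bonus.
for _ in range(4):
    backpropagate(a, 0.0)
next_child = select(root)
```

After repeated unrewarded visits to the high-prior child, its bonus decays and the search moves to the unvisited low-prior sibling, which is the exploration behavior the prior term is meant to balance.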
2025
MixLLM: Dynamic Routing in Mixed Large Language Models
Xinyuan Wang | Yanchi Liu | Wei Cheng | Xujiang Zhao | Zhengzhang Chen | Wenchao Yu | Yanjie Fu | Haifeng Chen
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Large Language Models (LLMs) have recently shown potential for artificial general intelligence; however, their usage is costly and incurs high response latency. Given mixed LLMs with their own strengths and weaknesses, LLM routing aims to identify the most suitable model for each query in the stream to maximize response quality and minimize cost and latency. However, the challenges involve: (1) dynamic trade-offs among quality, cost, and latency; (2) enabling continual learning in deployed systems; and (3) navigating a varying (e.g., new LLM addition or old LLM removal) set of LLM candidates over time. To bridge these gaps, we develop MixLLM, a dynamic contextual-bandit-based routing system for query-LLM assignment. Specifically, we first leverage query tags to enhance query embeddings for the routing task. Next, we design lightweight prediction models to estimate the response qualities and costs of queries over LLMs. We then devise a meta-decision maker to choose the query-LLM assignments that best trade off response quality, cost, and latency. Finally, the system benefits from continual training, allowing it to adapt to evolving queries and user feedback over time. Our extensive experiments show that MixLLM achieves the best trade-offs in response quality, cost, and latency (97.25% of GPT-4’s quality at 24.18% of the cost under the time constraint).
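The trade-off made by a routing decision of this kind can be sketched with a simple linear score over predicted quality, cost, and latency. This is an assumed scoring rule for illustration only, not MixLLM's meta-decision maker; the `Candidate` fields, the weights, and the example numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Candidate:
    name: str
    quality: float   # predicted response quality, higher is better
    cost: float      # predicted cost, normalized so the priciest model is about 1.0
    latency: float   # expected latency in seconds

def route(candidates: List[Candidate],
          cost_weight: float = 0.5,
          latency_weight: float = 0.1) -> Candidate:
    """Return the candidate maximizing quality minus weighted cost and latency."""
    if not candidates:
        raise ValueError("no candidate models to route to")

    def score(c: Candidate) -> float:
        return c.quality - cost_weight * c.cost - latency_weight * c.latency

    return max(candidates, key=score)

# Hypothetical predictions for one query.
pool = [
    Candidate("large-model", quality=0.95, cost=1.0, latency=2.0),
    Candidate("small-model", quality=0.80, cost=0.1, latency=0.5),
]
cheap_choice = route(pool)                                        # penalizes cost and latency
quality_choice = route(pool, cost_weight=0.0, latency_weight=0.0) # quality only
```

Because the candidate list is passed in per call, adding or removing a model only changes the pool, which mirrors the varying candidate set the abstract describes.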
Exploring Multi-Modal Data with Tool-Augmented LLM Agents for Precise Causal Discovery
ChengAo Shen | Zhengzhang Chen | Dongsheng Luo | Dongkuan Xu | Haifeng Chen | Jingchao Ni
Findings of the Association for Computational Linguistics: ACL 2025
Causal discovery is an essential foundation for decision-making across domains such as smart health, AI for drug discovery, and AIOps. Traditional statistical causal discovery methods, while well-established, predominantly rely on observational data and often overlook the semantic cues inherent in cause-and-effect relationships. The advent of Large Language Models (LLMs) has ushered in an affordable way of leveraging the semantic cues for knowledge-driven causal discovery, but the development of LLMs for causal discovery lags behind other areas, particularly in the exploration of multi-modal data. To bridge the gap, we introduce MatMCD, a multi-agent system powered by tool-augmented LLMs. MatMCD has two key agents: a Data Augmentation agent that retrieves and processes modality-augmented data, and a Causal Constraint agent that integrates multi-modal data for knowledge-driven reasoning. The proposed design of the inner workings ensures successful cooperation of the agents. Our empirical study across seven datasets suggests the significant potential of multi-modality enhanced causal discovery.
Co-authors
- Haifeng Chen 6
- Xujiang Zhao 6
- Yanchi Liu 5
- Wei Cheng 3
- Haifeng Chen 2
- Yanjie Fu 2
- Jundong Li 2
- Wenchao Yu 2
- Binchi Zhang 2
- Feng Chen 1
- Shengyu Chen 1
- Xinshuai Dong 1
- Dong Li 1
- Minhua Lin 1
- Xuyuan Liu 1
- Dongsheng Luo 1
- Jingchao Ni 1
- ChengAo Shen 1
- Haoyu Wang 1
- Junxiang Wang 1
- Suhang Wang 1
- Xinyuan Wang 1
- Zongyu Wu 1
- Dongkuan Xu 1
- Yujun Yan 1
- Wangyang Ying 1
- Linlin Yu 1
- Xiang Zhang 1
- Chen Zhao 1
- Zaiyi Zheng 1