Haifeng Li
2026
From Synthesis to Clinical Assistance: A Strategy-Aware Agent Framework for Autism Intervention based on Real Clinical Dataset
Junhong Lai | Shuzhong Lai | Yanhao Yu | Wanlin Chen | Chenyu Yan | Haifeng Li | Lin Yao | Yueming Wang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The development of AI-assisted Early Intensive Behavioral Intervention (EIBI) for Autism Spectrum Disorder (ASD) is severely constrained by data scarcity. Furthermore, while Applied Behavior Analysis (ABA) serves as the gold standard for clinical intervention, general-purpose Large Language Models (LLMs) struggle to strictly adhere to its standardized procedures, often resulting in interactions that are linguistically fluent but strategically inconsistent. To address these challenges, we introduce ASDAgent, a strategy-aware framework designed to unify high-fidelity intervention dialogue synthesis and clinical decision support. ASDAgent incorporates two specialized components to solve distinct problems: (i) a DoctorAgent equipped with an Observe-Think-Act-Correct (O-T-A-C) reasoning loop, which resolves the issue of strategy collapse in LLMs by making ABA execution explicit and controllable; and (ii) a ChildAgent that utilizes probabilistic behavior modeling to mitigate data homogeneity, simulating diverse and non-deterministic ASD response patterns. Experiments demonstrate that dialogues generated by ASDAgent closely mirror the strategy distribution of human therapists (KL divergence: 0.083). In real autism intervention, ASDAgent achieves nearly 80% strategic consistency with human experts. Moreover, we show that synthetic data produced by ASDAgent effectively distills professional clinical knowledge into small language models (SLMs), significantly enhancing their therapeutic capabilities.
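The strategy-distribution comparison reported in this abstract can be illustrated with a minimal sketch. It assumes each dialogue turn carries one strategy label; the label names, the smoothing constant, and the KL direction (human relative to synthetic) are illustrative assumptions, not details taken from the paper.

```python
import math
from collections import Counter

def strategy_distribution(labels, strategies, eps=1e-9):
    """Turn a list of per-turn strategy labels into a smoothed probability vector."""
    counts = Counter(labels)
    total = len(labels) + eps * len(strategies)
    # eps keeps every probability strictly positive so the log below is defined
    return [(counts[s] + eps) / total for s in strategies]

def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same ordered support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical strategy labels, for illustration only
STRATEGIES = ["prompt", "reinforce", "model", "redirect"]
human = ["prompt", "prompt", "reinforce", "model", "redirect", "reinforce"]
synthetic = ["prompt", "reinforce", "reinforce", "model", "redirect", "prompt"]

p = strategy_distribution(human, STRATEGIES)
q = strategy_distribution(synthetic, STRATEGIES)
divergence = kl_divergence(p, q)
```

A value near zero means the two label histograms are close; because KL divergence is asymmetric, the direction has to be fixed before comparing numbers across systems.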
Towards Self-Evolving Agents: Enabling Autonomy through Interactive Experience Refinement
Cheng Yang | Xuemeng Yang | Licheng Wen | Daocheng Fu | Jianbiao Mei | Rong Wu | Pinlong Cai | Yufan Shen | Nianchen Deng | Jia Xu | Botian Shi | Yu Qiao | Haifeng Li
Findings of the Association for Computational Linguistics: ACL 2026
Large Language Models often struggle with complex, multi-step operational tasks because they remain static during inference and cannot learn from past experience. To address this, we propose MUSE, a framework that enables iterative self-improvement through a hierarchical Memory Module. MUSE organizes cross-domain insights to facilitate the orchestration of long-horizon workflows. The core of our approach is an autonomous post-execution critique mechanism: after completing each sub-task, the system analyzes its operational logs and distills raw execution data into structured, reusable knowledge. This allows the agent to evolve dynamically rather than relying on fixed parameters. Evaluated on the rigorous TAC productivity benchmark, MUSE achieves new state-of-the-art results, significantly outperforming previous methods using only the streamlined Gemini-2.5 Flash model. Our analysis demonstrates that MUSE’s performance scales with the accumulation of insights and exhibits strong cross-task transferability, marking a key step toward autonomous systems capable of lifelong learning in professional environments. Demo videos can be found in our supplementary materials.
Graph-Based Chain-of-Thought Pruning for Reducing Redundant Reflections in Reasoning LLMs
Hongyuan Yuan | Xinran He | Run Shao | Bolei He | Xianwei Xue | Mengke Chen | Qiutong Pan | Haiwei Wang | Haifeng Li
Findings of the Association for Computational Linguistics: ACL 2026
Extending CoT through RL has been widely used to enhance the reasoning capabilities of LLMs. However, due to the sparsity of reward signals, it can also induce undesirable thinking patterns such as overthinking, i.e., generating redundant intermediate reasoning content. In this work, we argue that a major source of such redundancy is inefficient reflection, which often manifests in two problematic patterns: Indiscriminate Reflection, where the model performs broad, low-impact checks throughout reasoning, and Repetitive Reflection, where it repeatedly re-verifies an already established conclusion. To address this, we introduce a graph-based CoT optimization framework. Specifically, we convert each linear CoT into a directed acyclic graph (DAG) with explicit dependency edges, and design a dual pruning strategy: branch-level pruning removes weakly contributing reflection branches, while depth-level pruning eliminates late-stage re-verification. We distill this behavior via a three-stage pipeline: (1) SFT to initialize the policy on pruned concise traces, (2) DPO to prefer correct but less redundant trajectories, and (3) GRPO with length penalty to jointly optimize answer correctness and efficiency. Experiments show that our approach reduces the average reasoning tokens by 42% while maintaining or improving accuracy.
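The dual pruning idea in this abstract can be sketched on a toy dependency graph. The sketch below is a simplification under stated assumptions: it keeps only the steps the answer step transitively depends on, which stands in for the paper's contribution-based criteria that the abstract does not specify. The step ids, the `deps` mapping, and the `prune` helper are hypothetical.

```python
def ancestors(deps, target):
    """Return target plus every step it transitively depends on."""
    seen, stack = set(), [target]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(deps.get(node, []))
    return seen

def prune(steps, deps, answer_step):
    """Keep, in original order, only steps on a dependency path to the answer."""
    keep = ancestors(deps, answer_step)
    return [s for s in steps if s in keep]

# Toy chain of thought: s* are reasoning steps, r* are reflections (assumed labels)
steps = ["s1", "s2", "r1", "s3", "r2"]
deps = {
    "s2": ["s1"],
    "r1": ["s1"],  # side check that nothing later builds on
    "s3": ["s2"],  # step that establishes the conclusion
    "r2": ["s3"],  # late re-verification of the established conclusion
}

pruned = prune(steps, deps, answer_step="s3")
```

Here `r1` is dropped as a branch that does not feed the answer, and `r2` is dropped because it comes after the conclusion is already established, mirroring branch-level and depth-level pruning respectively.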
2025
ASD-iLLM: An Intervention Large Language Model for Autistic Children based on Real Clinical Dialogue Intervention Dataset
Shuzhong Lai | Chenxi Li | Junhong Lai | Yucun Zhong | Chenyu Yan | Xiang Li | Haifeng Li | Gang Pan | Lin Yao | Yueming Wang
Findings of the Association for Computational Linguistics: EMNLP 2025
Currently, leveraging large language models (LLMs) for autism intervention is a significant yet challenging task, particularly when directly employing LLMs as an intervention doctor. Researchers have mainly focused on using prompt engineering for role play as an intervention doctor and integrating auxiliary elements such as visual stimuli to enhance the sensory experience of the intervention, while neglecting the challenge that LLMs’ inherent dialogue style and intervention strategies do not meet the requirements of clinical dialogue interventions. To fill the gap, we propose a comprehensive framework for training LLMs to conduct dialogue interventions in accordance with the principles of Applied Behavior Analysis (ABA) which is commonly used by clinicians. Specifically, we collected clinical recordings of dialogue interventions for autistic children and constructed the topic dialogue dataset ASD-iLLM-8k. By incorporating the system prompt based on the ABA and ASD-iLLM-8k dataset, we fine-tuned LLMs to develop ASD-iLLM. We also proposed a role-play strategy in which LLMs act as autistic children to comprehensively evaluate the doctor model’s capabilities at the dialogue level. Extensive experiments indicate that ASD-iLLM outperforms existing models in both automatic and human evaluation, with intervention strategies and dialogue style more closely resembling those of clinical intervention doctors. Our dataset, model, and code are available on https://github.com/Shuzhong-Lai/ASD-iLLM.
Select to Know: An Internal-External Knowledge Self-Selection Framework for Domain-Specific Question Answering
Bolei He | Xinran He | Run Shao | Shanfu Shu | Xianwei Xue | MingQuan Cheng | Haifeng Li | Zhen-Hua Ling
Findings of the Association for Computational Linguistics: EMNLP 2025
Large Language Models (LLMs) perform well in general QA but often struggle in domain-specific scenarios. Retrieval-Augmented Generation (RAG) introduces external knowledge but suffers from hallucinations and latency due to noisy retrievals. Continued pretraining internalizes domain knowledge but is costly and lacks cross-domain flexibility. We attribute this challenge to the long-tail distribution of domain knowledge, which leaves partial yet useful internal knowledge underutilized. We further argue that knowledge acquisition should be progressive, mirroring human learning: first understanding concepts, then applying them to complex reasoning. To address this, we propose Selct2Know (S2K), a cost-effective framework that internalizes domain knowledge through an internal-external knowledge self-selection strategy and selective supervised fine-tuning. We also introduce a structured reasoning data generation pipeline and integrate GRPO to enhance reasoning ability. Experiments on medical, legal, and financial QA benchmarks show that S2K consistently outperforms existing methods and matches domain-pretrained LLMs with significantly lower cost.
Co-authors
- Xinran He 2
- Bolei He 2
- Shuzhong Lai 2
- Junhong Lai 2
- Run Shao 2
- Yueming Wang 2
- Xianwei Xue 2
- Chenyu Yan 2
- Pinlong Cai 1
- Wanlin Chen 1
- Mengke Chen 1
- MingQuan Cheng 1
- Nianchen Deng 1
- Daocheng Fu 1
- Chenxi Li 1
- Xiang Li 1
- Zhen-Hua Ling 1
- Jianbiao Mei 1
- Gang Pan 1
- Qiutong Pan 1
- Yu Qiao 1
- Yufan Shen 1
- Botian Shi 1
- Shanfu Shu 1
- Haiwei Wang 1
- Licheng Wen 1
- Rong Wu 1
- Jia Xu 1
- Cheng Yang 1
- Xuemeng Yang 1
- Lin Yao 1
- Lin Yao 1
- Yanhao Yu 1
- Hongyuan Yuan 1
- Yucun Zhong 1