Shao Zhang
2026
LLMs for Now, Fine-Tuning for Later: An Ensemble Approach to Data Drift in Domain-Specific Tasks
Yuxuan Lu | Bingsheng Yao | Shao Zhang | Yisi Sang | Yun Wang | Hansu Gu | Peng Zhang | Tun Lu | Toby Jia-Jun Li | Dakuo Wang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Deploying machine learning models in real-world domain-specific scenarios is challenged by the scarcity of expert annotations and by data drift, where the statistical properties of incoming data continuously evolve. Active Learning (AL) iteratively improves compact models with expert annotations but suffers from recurring cold-start degradation, while LLMs provide strong off-the-shelf performance yet cannot leverage newly accumulated labels, raising the question: how can we better leverage LLMs to assist the active learning process? Through an empirical study on five legal and biomedical datasets, we reveal a complementary temporal dynamic: LLMs excel during early and post-drift stages, while AL-assisted compact models eventually surpass them as annotations accumulate. Motivated by this finding, we propose an ensemble system that combines an LLM, an AL-assisted compact model, and an automatic switch module that routes predictions to the better-performing model in real time. Evaluated under simulated data drift on two mental health datasets, our system achieves 96–98% switch accuracy and consistently outperforms either model used alone.
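The routing idea in the abstract above can be illustrated with a minimal sketch. This is not the paper's switch module: the rolling-window accuracy rule, the class name `RollingSwitch`, the `window` parameter, and the LLM-first fallback are all assumptions made purely for illustration.

```python
from collections import deque


class RollingSwitch:
    """Hypothetical switch: route to whichever model was more accurate
    on the most recent expert-labeled examples (an assumed rule)."""

    def __init__(self, window=50):
        # Each deque keeps only the last `window` correctness flags.
        self.llm_hits = deque(maxlen=window)
        self.compact_hits = deque(maxlen=window)

    def update(self, llm_pred, compact_pred, gold):
        """Record whether each model matched a newly received expert label."""
        self.llm_hits.append(llm_pred == gold)
        self.compact_hits.append(compact_pred == gold)

    def choose(self):
        """Return "llm" or "compact" based on recent accuracy."""
        # Cold start: with no labels yet, fall back to the LLM.
        if not self.llm_hits:
            return "llm"
        llm_acc = sum(self.llm_hits) / len(self.llm_hits)
        compact_acc = sum(self.compact_hits) / len(self.compact_hits)
        # Ties go to the LLM in this sketch.
        return "compact" if compact_acc > llm_acc else "llm"

    def predict(self, llm_pred, compact_pred):
        """Pass through the prediction of the currently preferred model."""
        return compact_pred if self.choose() == "compact" else llm_pred
```

Because the window is bounded, a drift that degrades the compact model would push its recent accuracy down and move routing back to the LLM, which is the temporal pattern the abstract describes.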
2025
Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration
Shao Zhang | Xihuai Wang | Wenhao Zhang | Chaoran Li | Junru Song | Tingyu Li | Lin Qiu | Xuezhi Cao | Xunliang Cai | Wen Yao | Weinan Zhang | Xinbing Wang | Ying Wen
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Agents built on large language models (LLMs) have excelled in turn-by-turn human-AI collaboration but struggle with simultaneous tasks requiring real-time interaction. Latency issues and the challenge of inferring variable human strategies hinder their ability to make autonomous decisions without explicit instructions. Through experiments with current independent *System 1* and *System 2* methods, we validate the necessity of using Dual Process Theory (DPT) in real-time tasks. We propose DPT-Agent, a novel language agent framework that integrates *System 1* and *System 2* for efficient real-time simultaneous human-AI collaboration. DPT-Agent’s *System 1* uses a Finite-State Machine (FSM) and code-as-policy for fast, intuitive, and controllable decision-making. DPT-Agent’s *System 2* integrates Theory of Mind (ToM) and asynchronous reflection to infer human intentions and make reasoning-based autonomous decisions. We demonstrate the effectiveness of DPT-Agent through further experiments with rule-based agents and human collaborators, showing significant improvements over mainstream LLM-based frameworks. To the best of our knowledge, DPT-Agent is the first language agent framework that achieves successful real-time simultaneous human-AI collaboration autonomously. The code for DPT-Agent is available at https://github.com/sjtu-marl/DPT-Agent.
2024
More Samples or More Prompts? Exploring Effective Few-Shot In-Context Learning for LLMs with In-Context Sampling
Bingsheng Yao | Guiming Chen | Ruishi Zou | Yuxuan Lu | Jiachen Li | Shao Zhang | Yisi Sang | Sijia Liu | James Hendler | Dakuo Wang
Findings of the Association for Computational Linguistics: NAACL 2024
While most existing work on LLM prompting techniques focuses only on how to select a better set of data samples inside a single prompt input (In-Context Learning or ICL), why can we not design and leverage multiple prompts together to further improve the LLM’s performance? In this work, we propose In-Context Sampling (ICS), a low-resource LLM prompting technique to produce confident predictions by optimizing the construction of multiple ICL prompt inputs. Extensive experiments with three open-source LLMs (FlanT5-XL, Mistral-7B, and Mixtral-8x7B) on four NLI datasets (e-SNLI, Multi-NLI, ANLI, and Contract-NLI) and one QA dataset (CommonsenseQA) illustrate that ICS can consistently enhance LLMs’ performance. An in-depth evaluation with three data similarity-based ICS strategies suggests that these strategies can further elevate LLMs’ performance, which sheds light on a new and promising future research direction.
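As a rough illustration of combining several ICL prompts for one query, the sketch below samples different demonstration sets, queries a model once per prompt, and takes a majority vote. The random sampling, the prompt template, the vote-share confidence, and the names `build_prompt`, `ics_predict`, and `predict_fn` are all assumptions for illustration, not the paper's implementation.

```python
import random
from collections import Counter


def build_prompt(demos, query):
    """Format (input, label) demonstrations followed by the unlabeled query."""
    blocks = [f"Input: {x}\nLabel: {y}" for x, y in demos]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)


def ics_predict(query, pool, predict_fn, n_prompts=3, k=2, seed=0):
    """Build several ICL prompts from different demonstration subsets
    and aggregate the model's answers by majority vote.

    `predict_fn` is a hypothetical callable mapping a prompt string to a
    label; in practice it would wrap an LLM call.
    """
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_prompts):
        # Assumed strategy: draw k demonstrations uniformly at random.
        demos = rng.sample(pool, k)
        votes[predict_fn(build_prompt(demos, query))] += 1
    label, count = votes.most_common(1)[0]
    # The vote share serves as a crude confidence score.
    return label, count / n_prompts
```

A similarity-based strategy, as mentioned in the abstract, would replace the uniform `rng.sample` call with a selection based on some similarity measure between the query and the pool.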
StorySparkQA: Expert-Annotated QA Pairs with Real-World Knowledge for Children’s Story-Based Learning
Jiaju Chen | Yuxuan Lu | Shao Zhang | Bingsheng Yao | Yuanzhe Dong | Ying Xu | Yunyao Li | Qianwen Wang | Dakuo Wang | Yuling Sun
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Interactive story reading is common in early childhood education, where teachers expect to teach both language skills and real-world knowledge beyond the story. While many story reading systems have been developed for this activity, they often fail to infuse real-world knowledge into the conversation. This limitation can be attributed to the existing question-answering (QA) datasets used for children’s education, upon which the systems are built, failing to capture the nuances of how education experts think when conducting interactive story reading activities. To bridge this gap, we design an annotation framework, powered by an existing knowledge graph, to capture experts’ annotations and thinking processes, and leverage this framework to construct the StorySparkQA dataset, which comprises 5,868 expert-annotated QA pairs with real-world knowledge. We conduct automated and human expert evaluations across various QA pair generation settings to demonstrate that StorySparkQA can effectively support models in generating QA pairs that target real-world knowledge beyond story content. StorySparkQA is available at https://huggingface.co/datasets/NEU-HAI/StorySparkQA.
Co-authors
- Yuxuan Lu 3
- Dakuo Wang 3
- Bingsheng Yao 3
- Yisi Sang 2
- Xunliang Cai 1
- Xuezhi Cao 1
- Guiming Chen 1
- Jiaju Chen 1
- Yuanzhe Dong 1
- Hansu Gu 1
- James Hendler 1
- Chaoran Li 1
- Jiachen Li 1
- Tingyu Li 1
- Toby Jia-Jun Li 1
- Yunyao Li 1
- Sijia Liu 1
- Tun Lu 1
- Lin Qiu 1
- Junru Song 1
- Yuling Sun 1
- Qianwen Wang 1
- Xihuai Wang 1
- Xinbing Wang 1
- Yun Wang 1
- Ying Wen 1
- Ying Xu 1
- Wen Yao 1
- Peng Zhang 1
- Weinan Zhang 1
- Wenhao Zhang 1
- Ruishi Zou 1