Gang Pan
2026
SPEAK: Spiking Neurons as an Entropy-Aware Tokenizer for Large Language Models
Ming Chen | Wenyao Li | Chao Liang | Shi Gu | Peng Lin | De Ma | Huajin Tang | Qian Zheng | Gang Pan
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Tokenizers play a critical role in large language model studies. Despite recent advances, existing tokenizers neither explicitly leverage historical tokenization results when making subsequent token decisions nor selectively use such history based on contextual relevance. We propose SPEAK, a tokenizer that integrates spiking neurons to explicitly leverage historical tokenization results. Furthermore, we introduce an entropy-aware reset mechanism that selectively leverages history based on contextual relevance, which is determined by token-level entropy. High-entropy tokens are treated as contextual boundaries, whereas low-entropy tokens between consecutive boundaries exhibit strong contextual relevance. Accordingly, we induce a hard reset at high-entropy tokens to discard irrelevant historical tokenization results, and a soft reset at low-entropy tokens to preserve and leverage relevant history. Experiments on 2 language models and 5 datasets spanning 16 languages demonstrate superior cross-lingual adaptability, with competitive performance and efficiency. Our code is publicly available at https://github.com/zju-bmi-lab/SPEAK.
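The hard/soft reset distinction in the abstract above can be illustrated with a generic leaky integrate-and-fire neuron. The following is a minimal sketch under stated assumptions, not the SPEAK implementation: the function names, the decay factor, both thresholds, and the rule that a high-entropy token always triggers a hard reset while a low-entropy token triggers a soft reset only when the neuron fires are all hypothetical choices made for illustration. The abstract does not specify these details; see the linked repository for the authors' method.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of a next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def lif_with_entropy_reset(currents, entropies, v_th=1.0, decay=0.5, h_th=1.0):
    """Leaky integrate-and-fire neuron with an entropy-dependent reset.

    All parameters are illustrative assumptions:
      v_th  - firing threshold of the membrane potential
      decay - leak factor applied to the potential at every step
      h_th  - entropy level above which a token counts as a boundary
    Returns the spike train and the post-reset membrane potentials.
    """
    v = 0.0
    spikes, trace = [], []
    for x, h in zip(currents, entropies):
        v = decay * v + x          # leaky integration of the input
        fired = v >= v_th
        if h >= h_th:
            v = 0.0                # hard reset: discard accumulated history
        elif fired:
            v -= v_th              # soft reset: keep the residual potential
        spikes.append(int(fired))
        trace.append(v)
    return spikes, trace


if __name__ == "__main__":
    # Three low-entropy tokens followed by one high-entropy (boundary) token.
    spikes, trace = lif_with_entropy_reset(
        currents=[0.5, 0.5, 1.5, 0.75],
        entropies=[0.1, 0.1, 0.1, 2.0],
    )
    print(spikes, trace)
```

In this sketch the soft reset subtracts the threshold, so potential accumulated from earlier tokens carries over into the next step, while the hard reset zeroes the potential so that nothing before the boundary influences later steps.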
2025
ASD-iLLM: An Intervention Large Language Model for Autistic Children based on Real Clinical Dialogue Intervention Dataset
Shuzhong Lai | Chenxi Li | Junhong Lai | Yucun Zhong | Chenyu Yan | Xiang Li | Haifeng Li | Gang Pan | Lin Yao | Yueming Wang
Findings of the Association for Computational Linguistics: EMNLP 2025
Leveraging large language models (LLMs) for autism intervention is a significant yet challenging task, particularly when directly employing LLMs as an intervention doctor. Researchers have mainly focused on using prompt engineering for role play as an intervention doctor and on integrating auxiliary elements such as visual stimuli to enhance the sensory experience of the intervention, while neglecting the challenge that LLMs’ inherent dialogue style and intervention strategies do not meet the requirements of clinical dialogue interventions. To fill this gap, we propose a comprehensive framework for training LLMs to conduct dialogue interventions in accordance with the principles of Applied Behavior Analysis (ABA), which is commonly used by clinicians. Specifically, we collected clinical recordings of dialogue interventions for autistic children and constructed the topic dialogue dataset ASD-iLLM-8k. By combining an ABA-based system prompt with the ASD-iLLM-8k dataset, we fine-tuned LLMs to develop ASD-iLLM. We also propose a role-play strategy in which LLMs act as autistic children to comprehensively evaluate the doctor model’s capabilities at the dialogue level. Extensive experiments indicate that ASD-iLLM outperforms existing models in both automatic and human evaluation, with intervention strategies and dialogue style more closely resembling those of clinical intervention doctors. Our dataset, model, and code are available at https://github.com/Shuzhong-Lai/ASD-iLLM.
VLASCD: A Visual Language Action Model for Simultaneous Chatting and Decision Making
Zuojin Tang | Bin Hu | Chenyang Zhao | De Ma | Gang Pan | Bin Liu
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Recent large pretrained models such as LLMs (e.g., GPT series) and VLAs (e.g., OpenVLA) have achieved notable progress on multimodal tasks, yet they are built upon a multi-input single-output (MISO) paradigm. We show that this paradigm fundamentally limits performance in multi-input multi-output (MIMO) scenarios, where parallel task execution is required. In MISO architectures, tasks compete for a shared output channel, creating mutual exclusion effects that cause unbalanced optimization and degraded performance. To address this gap, we introduce MIMO-VLA (VLASCD), a unified training framework that enables concurrent multi-task outputs, exemplified by simultaneous dialogue generation and decision-making. Inspired by human cognition, MIMO-VLA eliminates interference between tasks and supports efficient parallel processing. Experiments on the CARLA autonomous driving platform demonstrate that MIMO-VLA substantially outperforms state-of-the-art MISO-based LLMs, reinforcement learning models, and VLAs in MIMO settings, establishing a new direction for multimodal and multitask learning.