Gang Pan



2026

Tokenizers play a critical role in large language model studies. Despite recent advances, existing tokenizers neither explicitly leverage historical tokenization results when making subsequent token decisions nor selectively use such history based on contextual relevance. We propose SPEAK, a tokenizer that integrates spiking neurons to explicitly leverage historical tokenization results. Furthermore, we introduce an entropy-aware reset mechanism that selectively leverages history based on contextual relevance, which is determined by token-level entropy. High-entropy tokens are treated as contextual boundaries, whereas low-entropy tokens between consecutive such boundaries exhibit strong contextual relevance. Accordingly, we induce a hard reset at high-entropy tokens to discard irrelevant historical tokenization results, and a soft reset at low-entropy tokens to preserve and leverage relevant history. Experiments on 2 language models and 5 datasets spanning 16 languages demonstrate superior cross-lingual adaptability, with competitive performance and efficiency. Our code is publicly available at https://github.com/zju-bmi-lab/SPEAK.
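The entropy-aware reset described in the abstract above can be illustrated with a toy leaky integrate-and-fire neuron. This is a minimal sketch under stated assumptions, not the SPEAK implementation: the function names, the decay factor, both thresholds, and the reading of "hard reset" as zeroing the membrane potential and "soft reset" as subtracting the firing threshold after a spike are all hypothetical choices made for illustration.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def run_entropy_aware_lif(inputs, entropies, entropy_threshold=1.0,
                          decay=0.5, fire_threshold=1.0):
    """Toy leaky integrate-and-fire neuron with an entropy-dependent reset.

    Assumed behaviour (not taken from the paper):
      * high-entropy token -> hard reset: membrane potential set to 0,
        so no history carries over the contextual boundary;
      * low-entropy token  -> soft reset: the firing threshold is subtracted
        only when the neuron spikes, so residual history is preserved.
    Returns the spike train and the membrane potential after each step.
    """
    membrane = 0.0
    spikes, trace = [], []
    for current, entropy in zip(inputs, entropies):
        # Leaky integration of the current input on top of decayed history.
        membrane = decay * membrane + current
        spike = 1 if membrane >= fire_threshold else 0
        if entropy >= entropy_threshold:
            membrane = 0.0              # hard reset at a contextual boundary
        elif spike:
            membrane -= fire_threshold  # soft reset keeps the residual
        spikes.append(spike)
        trace.append(membrane)
    return spikes, trace


if __name__ == "__main__":
    # The third token is treated as high entropy, i.e. a contextual boundary.
    spikes, trace = run_entropy_aware_lif(
        inputs=[0.75, 0.75, 0.75, 0.75],
        entropies=[0.1, 0.1, 2.0, 0.1],
    )
    print(spikes, trace)
```

In this sketch the membrane potential plays the role of the accumulated tokenization history: it survives across low-entropy tokens and is wiped at the high-entropy one, so the fourth step starts from a clean state.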

2025

Currently, leveraging large language models (LLMs) for autism intervention is a significant yet challenging task, particularly when directly employing LLMs as an intervention doctor. Researchers have mainly focused on using prompt engineering for role play as an intervention doctor and integrating auxiliary elements such as visual stimuli to enhance the sensory experience of the intervention, while neglecting the challenge that LLMs’ inherent dialogue style and intervention strategies do not meet the requirements of clinical dialogue interventions. To fill the gap, we propose a comprehensive framework for training LLMs to conduct dialogue interventions in accordance with the principles of Applied Behavior Analysis (ABA), which is commonly used by clinicians. Specifically, we collected clinical recordings of dialogue interventions for autistic children and constructed the topic dialogue dataset ASD-iLLM-8k. By incorporating a system prompt based on ABA and the ASD-iLLM-8k dataset, we fine-tuned LLMs to develop ASD-iLLM. We also proposed a role-play strategy in which LLMs act as autistic children to comprehensively evaluate the doctor model’s capabilities at the dialogue level. Extensive experiments indicate that ASD-iLLM outperforms existing models in both automatic and human evaluation, with intervention strategies and dialogue style more closely resembling those of clinical intervention doctors. Our dataset, model, and code are available at https://github.com/Shuzhong-Lai/ASD-iLLM.
Recent large pretrained models such as LLMs (e.g., GPT series) and VLAs (e.g., OpenVLA) have achieved notable progress on multimodal tasks, yet they are built upon a multi-input single-output (MISO) paradigm. We show that this paradigm fundamentally limits performance in multi-input multi-output (MIMO) scenarios, where parallel task execution is required. In MISO architectures, tasks compete for a shared output channel, creating mutual exclusion effects that cause unbalanced optimization and degraded performance. To address this gap, we introduce MIMO-VLA (VLASCD), a unified training framework that enables concurrent multi-task outputs, exemplified by simultaneous dialogue generation and decision-making. Inspired by human cognition, MIMO-VLA eliminates interference between tasks and supports efficient parallel processing. Experiments on the CARLA autonomous driving platform demonstrate that MIMO-VLA substantially outperforms state-of-the-art MISO-based LLMs, reinforcement learning models, and VLAs in MIMO settings, establishing a new direction for multimodal and multitask learning.