Min He
2026
Calibrated Speculative Decoding: Frequency-Guided Candidate Selection for Efficient Inference
Zhouxuwen | Fangxin Liu | Chao Wang | Xiao Zheng | Hao Zheng | Min He | Li Jiang | Haibing Guan
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Speculative decoding accelerates autoregressive generation by letting draft tokens bypass full verification, but conventional frameworks suffer from frequent false rejections, particularly when draft models produce semantically correct but lexically divergent outputs. In this paper, we present Calibrated Speculative Decoding (CSD), a training-free framework that recovers valid tokens discarded by standard verification. Guided by the principle of "Frequency-Guided Candidate Selection and Probability-Guarded Acceptance," CSD incorporates two lightweight modules: Online Correction Memory, which aggregates historical rejections to propose recurring divergence patterns as rescue candidates, and Semantic Consistency Gating, which verifies candidate admissibility using probability ratios instead of exact token matching. Our evaluation across diverse large language models demonstrates that CSD outperforms existing methods, achieving a peak throughput speedup of 2.33x. CSD preserves model accuracy across all tasks while further boosting performance on complex reasoning datasets. These results establish CSD as a highly effective, lightweight solution for practical LLM deployments.
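The abstract names two modules without giving their mechanics, so the following is only a minimal sketch of one way they could fit together, not the authors' implementation. It assumes greedy verification, a memory keyed on (draft token, target token) pairs, a fixed recurrence threshold `min_count`, and a ratio threshold `tau`. All class, function, and parameter names here are hypothetical.

```python
from collections import Counter


class CorrectionMemory:
    """Hypothetical stand-in for an online correction memory.

    Counts how often a given draft token was rejected in favour of a given
    target token, and treats pairs seen at least `min_count` times as
    recurring divergence patterns (an assumed criterion).
    """

    def __init__(self, min_count=2):
        self.min_count = min_count
        self.counts = Counter()

    def record(self, draft_token, target_token):
        self.counts[(draft_token, target_token)] += 1

    def is_candidate(self, draft_token, target_token):
        return self.counts[(draft_token, target_token)] >= self.min_count


def ratio_gate(target_probs, draft_token, target_token, tau=0.5):
    """Admit the draft token if the target model gives it at least `tau`
    times the probability of its own top token (assumed gating rule)."""
    p_top = target_probs.get(target_token, 0.0)
    if p_top <= 0.0:
        return False
    return target_probs.get(draft_token, 0.0) / p_top >= tau


def verify(draft_token, target_probs, memory, tau=0.5):
    """Verify one draft token against the target distribution.

    Returns (emitted_token, status), where status is "accepted",
    "rescued", or "rejected".
    """
    target_token = max(target_probs, key=target_probs.get)
    if draft_token == target_token:
        return draft_token, "accepted"
    # Exact match failed: try to rescue a recurring, high-probability divergence.
    if memory.is_candidate(draft_token, target_token) and ratio_gate(
        target_probs, draft_token, target_token, tau
    ):
        return draft_token, "rescued"
    # Otherwise fall back to the target token and remember the rejection.
    memory.record(draft_token, target_token)
    return target_token, "rejected"


memory = CorrectionMemory(min_count=2)
probs = {"big": 0.5, "large": 0.4, "cat": 0.1}
history = [verify("large", probs, memory) for _ in range(3)]
```

In this toy run the first two occurrences of "large" are rejected and recorded. The third is rescued because the pair has recurred and its probability ratio (0.4 / 0.5) clears `tau`. A low-probability token such as "cat" would never pass the gate, however often it recurs.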
New Terms, New Toxicity: Consensus-based Chinese Neologism Toxicity Detection via Search-Augmented LLMs
Shiyao Cui | QingLin Zhang | Di Wang | Yida Lu | Zhexin Zhang | Jinhua Gao | Jinglin Yang | Min He | Han Qiu | Minlie Huang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Neologisms, terms that are new in meaning or form, can serve as new vehicles for toxic expression, such as "田园女" ("country girl"), a stigmatizing label targeting feminism. Such toxic neologisms appear benign but have acquired toxic usage in public consensus; they pose challenges to moderation systems and remain underexplored. In this paper, we investigate how to detect implicit toxicity expressed via neologisms. We first propose a taxonomy that captures the origins and consensus-verification criteria of toxic neologisms, followed by the construction of a lexicon spanning widely observed risk categories. To capture toxicity grounded in public consensus, we introduce **SeTox**, a search-augmented framework that enables static large language models (LLMs) to incorporate real-time web context for neologism toxicity detection. Experiments show that **SeTox**, even with 3B-scale models, outperforms recent large-scale models, demonstrating that it can scalably incorporate real-world knowledge for toxic neologism detection. **Disclaimer**: this paper contains offensive content that may be disturbing to some readers.