Mixture of Small and Large Models for Chinese Spelling Check

Ziheng Qiao, Houquan Zhou, Zhenghua Li


Abstract
In the era of large language models (LLMs), various LLM-based methods have been developed for the Chinese Spelling Check (CSC) task, yet their performance remains unsatisfactory. In contrast, fine-tuned BERT-based models, which rely on high-quality in-domain data, achieve excellent performance but suffer from edit-pattern overfitting. This paper proposes a novel dynamic mixture approach that combines the probability distributions of small models and LLMs during beam search decoding, balancing the precise corrections of small models with the fluency of LLMs. The approach also eliminates the need to fine-tune LLMs, saving significant time and resources and facilitating domain adaptation. Comprehensive experiments demonstrate that our mixture approach significantly boosts error correction capability, achieving state-of-the-art results across multiple datasets. Our code is available at https://github.com/zhqiao-nlp/MSLLM.
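The abstract's key mechanism, interpolating the next-token distributions of a small model and an LLM at each beam search step, can be sketched in a few lines. Below is a minimal PyTorch illustration of one decoding step; the function name, tensor shapes, and the fixed mixture weight `lam` are assumptions for illustration only, since the paper's actual dynamic per-step weighting scheme is not reproduced here.

```python
import torch

def mixed_beam_step(small_logits, llm_logits, beam_scores, beam_width, lam=0.5):
    """One beam-search step over a mixture of two next-token distributions.

    small_logits, llm_logits: (num_beams, vocab_size) logits from the
        fine-tuned small model and the LLM for each beam's current prefix.
    beam_scores: (num_beams,) running log-probabilities of each beam.
    lam: mixture weight (fixed here for simplicity; the paper's dynamic
        weighting would replace this constant).
    """
    p_small = torch.softmax(small_logits, dim=-1)
    p_llm = torch.softmax(llm_logits, dim=-1)
    # Linear interpolation of the two distributions, scored in log space.
    p_mix = lam * p_small + (1.0 - lam) * p_llm
    log_p = torch.log(p_mix + 1e-12)
    # Extend every beam with every token, then keep the top-k candidates.
    cand = beam_scores.unsqueeze(-1) + log_p          # (num_beams, vocab_size)
    top_scores, flat_idx = cand.view(-1).topk(beam_width)
    vocab_size = p_mix.size(-1)
    beam_idx = torch.div(flat_idx, vocab_size, rounding_mode="floor")
    token_idx = flat_idx % vocab_size
    return beam_idx, token_idx, top_scores

# Toy usage: 3 beams over a vocabulary of 10 tokens.
small = torch.randn(3, 10)
llm = torch.randn(3, 10)
scores = torch.zeros(3)
print(mixed_beam_step(small, llm, scores, beam_width=3))
```

Mixing in probability space (rather than averaging logits) keeps the interpolation well-defined even when the two models are calibrated differently; this is one plausible reading of "combining the probability distributions," not a verified detail of the paper.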
Anthology ID:
2025.acl-long.1372
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
28298–28311
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.1372/
Cite (ACL):
Ziheng Qiao, Houquan Zhou, and Zhenghua Li. 2025. Mixture of Small and Large Models for Chinese Spelling Check. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 28298–28311, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Mixture of Small and Large Models for Chinese Spelling Check (Qiao et al., ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-long.1372.pdf