Enhancing Chinese Offensive Language Detection with Homophonic Perturbation

Junqi Wu, Shujie Ji, Kang Zhong, Huiling Peng, Zhendongxiao, Xiongding Liu, Wu Wei


Abstract
Detecting offensive language in Chinese is challenging due to homophonic substitutions used to evade detection. We propose a framework to improve large language models’ robustness against such phonetic attacks. First, we construct HED-COLD, the first large-scale and systematic homophonic dataset for Chinese offensive language detection. Second, we design a homophone-aware pretraining strategy that learns the mappings among orthography, phonetics, and semantics between original and perturbed text. Experimental results show that our approach achieves state-of-the-art performance on both the COLD test set and the toxicity benchmark ToxiCloakCN. Notably, it achieves greater gains in domains susceptible to homophonic attacks, such as gender and regional content. These results demonstrate improved robustness and generalization against phonetic adversarial attacks.
Anthology ID:
2025.emnlp-main.1154
Volume:
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month:
November
Year:
2025
Address:
Suzhou, China
Editors:
Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
22671–22686
URL:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.emnlp-main.1154/
DOI:
10.18653/v1/2025.emnlp-main.1154
Cite (ACL):
Junqi Wu, Shujie Ji, Kang Zhong, Huiling Peng, Zhendongxiao, Xiongding Liu, and Wu Wei. 2025. Enhancing Chinese Offensive Language Detection with Homophonic Perturbation. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 22671–22686, Suzhou, China. Association for Computational Linguistics.
Cite (Informal):
Enhancing Chinese Offensive Language Detection with Homophonic Perturbation (Wu et al., EMNLP 2025)
PDF:
https://preview.aclanthology.org/author-page-yu-wang-polytechnic/2025.emnlp-main.1154.pdf
Checklist:
 2025.emnlp-main.1154.checklist.pdf