Toward Traditional Chinese ModernBERT: A Preliminary Study

Yi-En Chen, Qiao-Ying He, Kuan-Yu Chen


Abstract
This study employs several state-of-the-art techniques, including rotary position embeddings (RoPE) and Flash Attention, and leverages large-scale Chinese web corpora and encyclopedic data to pre-train an encoder model designed for long Traditional Chinese text. We evaluate the model on tasks such as reading comprehension and text classification, and the results show that its overall performance lags behind existing Chinese baseline models. Through pseudo-perplexity analysis, we infer that the pre-training phase did not sufficiently capture the data distribution, potentially due to factors such as hyperparameter settings, insufficient convergence, and data quality. Although the results are suboptimal, this study still offers valuable experimental insights and directions for improving Chinese language model development.
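The pseudo-perplexity analysis mentioned above refers to the standard masked-LM scoring procedure: each token is masked in turn, the model scores it, and the averaged negative log-likelihood is exponentiated. Below is a minimal sketch of such a computation, assuming a Hugging Face-style masked LM; the checkpoint name and helper function are illustrative placeholders, not the paper's actual model or code.

```python
# Minimal illustrative sketch of pseudo-perplexity for a masked LM.
# "bert-base-chinese" is a placeholder checkpoint, not the paper's model.
import math
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = AutoModelForMaskedLM.from_pretrained("bert-base-chinese")
model.eval()

def pseudo_perplexity(text: str) -> float:
    """Mask each token in turn, score it with the MLM head, and
    exponentiate the average negative log-likelihood."""
    ids = tokenizer(text, return_tensors="pt")["input_ids"][0]
    total_nll = 0.0
    for i in range(1, len(ids) - 1):          # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total_nll -= torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return math.exp(total_nll / max(len(ids) - 2, 1))

# Lower values indicate the model fits the text better; this kind of score
# is what the paper uses to probe how well pre-training captured the data.
print(pseudo_perplexity("這是一個範例句子。"))
```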
Anthology ID:
2025.rocling-main.16
Volume:
Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025)
Month:
November
Year:
2025
Address:
National Taiwan University, Taipei City, Taiwan
Editors:
Kai-Wei Chang, Ke-Han Lu, Chih-Kai Yang, Zhi-Rui Tam, Wen-Yu Chang, Chung-Che Wang
Venue:
ROCLING
Publisher:
Association for Computational Linguistics
Pages:
133–139
URL:
https://preview.aclanthology.org/dashboard/2025.rocling-main.16/
Cite (ACL):
Yi-En Chen, Qiao-Ying He, and Kuan-Yu Chen. 2025. Toward Traditional Chinese ModernBERT: A Preliminary Study. In Proceedings of the 37th Conference on Computational Linguistics and Speech Processing (ROCLING 2025), pages 133–139, National Taiwan University, Taipei City, Taiwan. Association for Computational Linguistics.
Cite (Informal):
Toward Traditional Chinese ModernBERT: A Preliminary Study (Chen et al., ROCLING 2025)
PDF:
https://preview.aclanthology.org/dashboard/2025.rocling-main.16.pdf