ACSE: An Ancient Character Semantic-Aware Embedding for Large Language Models
Zhihan Zhou, Daqian Shi, Lida Shi, Rui Song, Peiqiang Qiu, Xiaolei Diao, Hao Xu
Abstract
Research on the ancient Chinese language is of great significance for tracing Chinese history and civilization. In the field of large language models, studies of pre-Qin excavated documents such as Oracle Bone Inscriptions, Bronze Inscriptions, and the Bamboo Book of Chu remain insufficient. This is because these ancient characters are poorly digitized, training corpora are extremely scarce, and the characters typically carry complex and rich semantic information. Therefore, we propose an ancient character semantic-aware embedding for large language models. This embedding integrates both the glyph and lexicality of ancient characters and maps them to the modern Chinese semantic space. We also design a two-stage method for lightweight and parameter-efficient training of the embedding. Finally, we conduct extensive experiments on excavated documents from the pre-Qin period, and the results demonstrate the effectiveness of our approach.
- Anthology ID:
- 2026.findings-acl.437
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 9000–9012
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.437/
- Cite (ACL):
- Zhihan Zhou, Daqian Shi, Lida Shi, Rui Song, Peiqiang Qiu, Xiaolei Diao, and Hao Xu. 2026. ACSE: An Ancient Character Semantic-Aware Embedding for Large Language Models. In Findings of the Association for Computational Linguistics: ACL 2026, pages 9000–9012, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- ACSE: An Ancient Character Semantic-Aware Embedding for Large Language Models (Zhou et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.437.pdf