Exploring the Application of 7B LLMs for Named Entity Recognition in Chinese Ancient Texts

Chenrui Zheng, Yicheng Zhu, Han Bi


Abstract
This paper explores fine-tuning methods based on 7B large language models (LLMs) for named entity recognition (NER) in Chinese ancient texts. To address the complex semantics and domain-specific characteristics of ancient texts, particularly Traditional Chinese Medicine (TCM) texts, we propose a comprehensive pre-training and fine-tuning strategy. By combining multi-task learning, domain-specific pre-training, and efficient LoRA-based fine-tuning, we achieve significant performance improvements on ancient-text NER. Experimental results show that the pre-trained and fine-tuned 7B model reaches an F1 score of 0.93, substantially outperforming general-purpose large language models.
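The abstract's "efficient fine-tuning based on LoRA" refers to freezing the pretrained weights and training only a low-rank update. A minimal numpy sketch of that idea (not the authors' code; all sizes here are illustrative, and a real 7B setup would use a library such as PEFT on the model's attention projections):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: r is the low rank, much smaller than d_in/d_out.
d_in, d_out, r, alpha = 64, 64, 8, 16

# Frozen pretrained weight (stands in for one linear layer of the 7B model).
W = rng.normal(size=(d_out, d_in)) * 0.02

# LoRA adapters: A is randomly initialized, B starts at zero,
# so the adapted model initially matches the frozen model exactly.
A = rng.normal(size=(r, d_in)) * 0.02
B = np.zeros((d_out, r))

def forward(x, W, A, B, scale=alpha / r):
    # y = W x + (alpha / r) * B A x; only A and B would receive gradients.
    return W @ x + scale * (B @ (A @ x))

x = rng.normal(size=(d_in,))
# Before any training, the LoRA branch contributes nothing.
assert np.allclose(forward(x, W, A, B), W @ x)
```

Training updates only A and B (2 * r * d parameters per layer instead of d * d), which is what makes fine-tuning a 7B model tractable on modest hardware.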
Anthology ID:
2025.alp-1.18
Volume:
Proceedings of the Second Workshop on Ancient Language Processing
Month:
May
Year:
2025
Address:
The Albuquerque Convention Center, Laguna
Editors:
Adam Anderson, Shai Gordin, Bin Li, Yudong Liu, Marco C. Passarotti, Rachele Sprugnoli
Venues:
ALP | WS
Publisher:
Association for Computational Linguistics
Pages:
150–155
URL:
https://preview.aclanthology.org/moar-dois/2025.alp-1.18/
DOI:
10.18653/v1/2025.alp-1.18
Cite (ACL):
Chenrui Zheng, Yicheng Zhu, and Han Bi. 2025. Exploring the Application of 7B LLMs for Named Entity Recognition in Chinese Ancient Texts. In Proceedings of the Second Workshop on Ancient Language Processing, pages 150–155, The Albuquerque Convention Center, Laguna. Association for Computational Linguistics.
Cite (Informal):
Exploring the Application of 7B LLMs for Named Entity Recognition in Chinese Ancient Texts (Zheng et al., ALP 2025)
PDF:
https://preview.aclanthology.org/moar-dois/2025.alp-1.18.pdf