Shenghan Xu
2025
When Less Is More: Logits-Constrained Framework with RoBERTa for Ancient Chinese NER
Wenjie Hua, Shenghan Xu
Proceedings of the Second Workshop on Ancient Language Processing
This report presents our team’s work on ancient Chinese Named Entity Recognition (NER) for EvaHan 2025. We propose a two-stage framework that combines GujiRoBERTa with a Logits-Constrained (LC) mechanism. In the first stage, GujiRoBERTa generates contextual embeddings; in the second, dynamically masked decoding enforces valid BMES transitions. Experiments on the EvaHan 2025 datasets demonstrate the framework’s effectiveness. Key findings include the LC framework’s superiority over CRFs in scenarios with many labels and the detrimental effect of adding BiLSTM modules. We also establish empirical model selection guidelines based on label complexity and dataset size.
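The idea of masked decoding over BMES labels can be illustrated with a minimal sketch. This is not the authors' implementation: the greedy left-to-right strategy, the function names `allowed` and `constrained_decode`, the label set, and the toy logits are all assumptions made for illustration. At each position, labels that would form an invalid BMES transition from the previous prediction are masked to negative infinity before taking the argmax.

```python
import math

def allowed(prev, label, is_last):
    """Return True if `label` may follow `prev` under BMES (prev is None at the start)."""
    inside = prev is not None and prev[0] in "BM"  # an entity is currently open
    if label == "O":
        return not inside
    tag, etype = label.split("-", 1)
    if inside:
        # An open entity must continue (M) or close (E) with the same type;
        # at the last position it can only close.
        same_type = etype == prev.split("-", 1)[1]
        return same_type and (tag == "E" or (tag == "M" and not is_last))
    # Outside an entity: emit a single (S), or begin (B) if there is room to close.
    return tag == "S" or (tag == "B" and not is_last)

def constrained_decode(logits, labels):
    """Greedy decoding with invalid transitions masked out (hypothetical sketch)."""
    prev, path = None, []
    for t, row in enumerate(logits):
        is_last = t == len(logits) - 1
        masked = [x if allowed(prev, lab, is_last) else -math.inf
                  for x, lab in zip(row, labels)]
        best = max(range(len(labels)), key=masked.__getitem__)
        prev = labels[best]
        path.append(prev)
    return path

# Toy example: per-token logits as they might come from a token classifier.
LABELS = ["O", "B-PER", "M-PER", "E-PER", "S-PER"]
LOGITS = [
    [0.1, 2.0, 0.0, 0.0, 0.5],  # unconstrained argmax: B-PER
    [3.0, 0.0, 0.2, 0.1, 0.0],  # unconstrained argmax: O (invalid after B-PER)
    [1.0, 0.0, 0.5, 0.3, 0.0],  # unconstrained argmax: O (invalid inside an entity)
]
print(constrained_decode(LOGITS, LABELS))  # ['B-PER', 'M-PER', 'E-PER']
```

In this toy input, plain argmax would yield `B-PER, O, O`, which leaves an entity unclosed; masking forces a well-formed span instead. Unlike a CRF, this sketch adds no learned transition parameters, so the constraint is applied purely at decoding time.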