Lingling Shi (石玲玲)


Also published as: 玲玲石

2026

Large Language Models have shown strong performance in Machine Translation, yet they often suffer from paraphrasing errors, omissions, or hallucinations when the input contains translation-specific elements (e.g., URLs, slang, and idioms) that require strict preservation or controlled transformation, undermining the reliability of critical details. We propose CEMT, a Controllable Element-Oriented Machine Translation framework inspired by the analysis–strategy–generation paradigm in human translation. CEMT first employs an Element Detection Module to identify translation-specific elements, and then introduces a Translation Module that decomposes the translation process into linguistically grounded analysis, strategy formulation, and final generation, thereby guiding the reliable translation of these elements. We further introduce a CoT Judge model during training that provides step-wise supervision over the accuracy and consistency of the translation process. On the WMT23/24 Chinese–English benchmarks, CEMT improves performance over existing Machine Translation models while significantly reducing element-level constraint violations.

2023


CCL23-Eval 任务1系统报告:基于持续预训练方法与上下文增强策略的古籍命名实体识别 (System Report for CCL23-Eval Task 1: Named Entity Recognition for Ancient Books based on Continual Pre-training Method and Context Augmentation Strategy)
Shiquan Wang (士权王) | Lingling Shi (石玲玲) | Luwen Pu (蒲璐汶) | Ruiyu Fang (方瑞玉) | Yu Zhao (宇赵) | Shuangyong Song (宋双永)
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)

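This paper describes the system submitted by the team "翼智团" to the CCL23 evaluation on named entity recognition for ancient books. The task aims to automatically identify important entities that form the basic elements of events in ancient texts, such as person names, book titles, and official titles. It is split into an open track and a closed track according to whether the model used has more than 10B parameters. For this task, we first apply continual pre-training and fine-tuning to open-source pre-trained models using ancient-book domain data and task data, which significantly improves the base model's performance on named entity recognition for ancient books. We then propose a pair-wise voting algorithm for screening low-confidence entities to obtain candidate entities, and use a context augmentation strategy to correct the recognition of those candidates. In the final evaluation, our system ranked second in the closed track with an F1 score of 95.8727.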

Co-authors

Luwen Pu 1

Jinsong Su 1

Shiquan Wang 1

Yu Zhao 1

Venues

CCL 1
Findings 1
