Lingling Shi

Also published as: 玲玲


2026

This paper describes our framework for SemEval-2026 Task 6 (CLARITY - Unmasking Political Question Evasions), which focuses on classifying clarity and fine-grained evasion types in political question-answering dialogues. We propose CAMSR-CoT, a confidence-aware multi-stage reasoning framework that unifies the two subtasks through hierarchical label modeling. The framework adopts a confidence-based routing strategy: high-certainty cases are directly resolved, while ambiguous samples are routed to deeper Chain-of-Thought reasoning stages with boundary-aware few-shot exemplars to mitigate label confusion. On the development set, our framework achieves Macro-F1 scores of 0.812 on SubTask 1 and 0.617 on SubTask 2. On the official hidden test set, it ranks 1st in both SubTask 1 (Macro-F1 = 0.89) and SubTask 2 (Macro-F1 = 0.68).
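The confidence-based routing described in this abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration: the `route` function, the two stage callables, the 0.8 threshold, and the toy labels are hypothetical and not taken from the CAMSR-CoT paper.

```python
from typing import Callable, Tuple

# Hypothetical stage signature: takes (question, answer) and returns
# (label, confidence in [0, 1]). Not the authors' actual interface.
Stage = Callable[[str, str], Tuple[str, float]]


def route(question: str, answer: str,
          fast_stage: Stage, cot_stage: Stage,
          threshold: float = 0.8) -> Tuple[str, str]:
    """Resolve high-confidence cases directly; send the rest to a deeper stage."""
    label, confidence = fast_stage(question, answer)
    if confidence >= threshold:
        return label, "direct"
    # Ambiguous sample: fall back to the (assumed) Chain-of-Thought stage.
    label, _ = cot_stage(question, answer)
    return label, "cot"


def toy_fast_stage(question: str, answer: str) -> Tuple[str, float]:
    # Stand-in classifier: very short answers are treated as uncertain.
    if len(answer.split()) < 5:
        return "ambivalent", 0.4
    return "clear", 0.95


def toy_cot_stage(question: str, answer: str) -> Tuple[str, float]:
    # Stand-in for a deeper reasoning stage with few-shot exemplars.
    return "evasive", 1.0
```

In a real system the two stages would presumably be model calls, and the threshold would be tuned on the development set.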
Large Language Models have shown strong performance in Machine Translation, yet they often suffer from paraphrasing errors, omissions, or hallucinations when the input contains translation-specific elements (e.g., URLs, slang, and idioms) that require strict preservation or controlled transformation, undermining the reliability of critical details. We propose CEMT, a Controllable Element-Oriented Machine Translation framework inspired by the analysis–strategy–generation paradigm in human translation. CEMT first employs an Element Detection Module to identify translation-specific elements, and then introduces a Translation Module that decomposes the translation process into linguistically grounded analysis, strategy formulation, and final generation, thereby guiding the reliable translation of these elements. We further introduce a CoT Judge model during training that provides step-wise supervision over the accuracy and consistency of the translation process. On the WMT23/24 Chinese–English benchmarks, CEMT improves performance over existing Machine Translation models while significantly reducing element-level constraint violations.

2023

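This paper describes the system submitted by team “翼智团” to the CCL23 evaluation on named entity recognition in ancient Chinese texts. The task aims to automatically identify important entities that form the basic elements of events in ancient texts, such as person names, book titles, and official titles. It is divided into an open track and a closed track according to whether the model used has more than 10B parameters. For this task, we first perform continued pre-training and fine-tuning of open-source pre-trained models on domain data and task data related to ancient texts, which significantly improves the base model's performance on ancient-text named entity recognition. We then propose a pair-wise voting algorithm for filtering low-confidence entities to obtain candidate entities, and correct the recognition of these candidates with a context-augmentation strategy. In the final evaluation, our system ranked second in the closed track with an F1 score of 95.8727.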
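The abstract does not specify how the pair-wise voting works, so the following is only one plausible reading, sketched under stated assumptions: several models each predict a set of entity spans, every pair of models "votes" on an entity by both containing it, and entities whose agreement ratio falls below a threshold become candidates for later correction. The function name, the `(start, end, type)` span format, the type tags, and the `min_agreement` parameter are all hypothetical.

```python
from itertools import combinations
from typing import List, Set, Tuple

# Hypothetical entity representation: (start offset, end offset, entity type).
Entity = Tuple[int, int, str]


def low_confidence_entities(predictions: List[Set[Entity]],
                            min_agreement: float = 1.0) -> Set[Entity]:
    """Return entities that too few model pairs agree on (assumed criterion)."""
    pairs = list(combinations(predictions, 2))
    if not pairs:
        # Fewer than two models: no pair-wise vote is possible.
        return set()
    candidates: Set[Entity] = set()
    for entity in set().union(*predictions):
        # A pair agrees on an entity only if both models predicted it.
        agreeing = sum(1 for a, b in pairs if entity in a and entity in b)
        if agreeing / len(pairs) < min_agreement:
            candidates.add(entity)
    return candidates


# Toy predictions from three hypothetical models over one sentence.
model_a = {(0, 2, "PER"), (5, 7, "BOOK")}
model_b = {(0, 2, "PER"), (5, 7, "OFI")}
model_c = {(0, 2, "PER"), (5, 7, "BOOK")}
flagged = low_confidence_entities([model_a, model_b, model_c])
```

Here all three models agree on the first span, so only the two conflicting labels for the second span are flagged; a correction step such as the context-augmentation strategy mentioned above would then decide between them.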