Xianling Mao
Also published as: 先领 毛
2026
uir-cis-7 at SemEval-2026 Task 7: Zero-Shot Chain-of-Thought Reasoning for Cross-Cultural Daily Knowledge
Jianning Gao | Xianling Mao | Shumin Shi | Duanzhi Zhaxi | Yingbo Sun | Xiandeng Li | Binyang Li
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)
SemEval-2026 Task 7 evaluates the ability of Large Language Models (LLMs) to reason about diverse daily knowledge across 30 geographic regions. In this paper, team uir-cis-7 approaches this challenge not merely as an accuracy optimization problem, but as a diagnostic probe to evaluate the representational limits of LLMs without fine-tuning. To address Western-centric bias and the "overthinking penalty" frequently observed in high-resource contexts, we introduce a Two-Tier Dynamic Routing framework. Based on cultural resource density, queries are routed either to a direct-answer pathway or a complex reasoning pathway. The complex pathway utilizes an Anti-Bias Persona-Conditioned Chain-of-Thought enhanced with Knowledge Anchoring and multi-path Self-Consistency voting to mitigate majority-culture heuristics. Evaluated using a strict macro-average metric, our system achieved an overall accuracy of 89.02% on the official leaderboard. Our fine-grained evaluation and theoretical error analysis quantify the epistemological boundaries of prompt-based alignment, proving our dynamic strategy effectively rescues marginalized cultural knowledge while exposing persistent instances where safety-aligned models project Western progressive norms onto traditional contexts. Furthermore, cross-model validation on open-source architectures explicitly confirms our framework’s generalizability.
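The abstract above names three mechanisms: routing by cultural resource density, Self-Consistency voting over several reasoning paths, and a macro-average metric. The following is only a minimal sketch of how such pieces could fit together, not the authors' implementation. The function names, the set-based routing rule, and the choice to macro-average over regions are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter, defaultdict


def route(region, high_resource_regions):
    """Hypothetical two-tier router: high-resource regions take the
    direct-answer pathway, all others take the reasoning pathway.
    (The set-membership rule is an assumption, not from the paper.)"""
    return "direct" if region in high_resource_regions else "reasoning"


def majority_vote(answers):
    """Self-consistency voting: return the most frequent answer among
    several sampled reasoning paths. Ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def macro_accuracy(records):
    """Macro-averaged accuracy over (region, is_correct) pairs: compute
    accuracy per region, then average the per-region scores so that
    every region counts equally. (Averaging over regions is assumed.)"""
    per_region = defaultdict(list)
    for region, is_correct in records:
        per_region[region].append(1.0 if is_correct else 0.0)
    region_scores = [sum(v) / len(v) for v in per_region.values()]
    return sum(region_scores) / len(region_scores)
```

Under these assumptions, a region with few questions weighs as much as a region with many, which is why a macro average is stricter than a pooled accuracy when performance on low-resource regions lags.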
2024
生成式文本质量的自动评估方法综述(A Survey of Automatic Evaluation on the Quality of Generated Text)
Tian Lan (兰天) | Ziao Ma (马梓奥) | Yanghao Zhou (周杨浩) | Chen Xu (徐晨) | Xianling Mao (毛先领)
Proceedings of the 23rd Chinese National Conference on Computational Linguistics (Volume 2: Frontier Forum)
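Human evaluation, the gold standard for assessing the quality of generated text, is too costly. The core idea of automatic evaluation is to produce results that correlate highly with human judgments, thereby enabling automated analysis and assessment of generated-text quality. As natural language processing technology has advanced, automatic evaluation of generated text has already gone through several shifts in technical paradigm. However, the field still lacks a systematic summary of these techniques. This paper therefore first systematically summarizes existing automatic evaluation methods for generated text, then analyzes the main development trends of these methods, and finally, to give readers a broader view of the area, discusses future research directions for automatic evaluation as a whole.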
2020
Can Monolingual Pretrained Models Help Cross-Lingual Classification?
Zewen Chi | Li Dong | Furu Wei | Xianling Mao | Heyan Huang
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing
Multilingual pretrained language models (such as multilingual BERT) have achieved impressive results for cross-lingual transfer. However, due to the constant model capacity, multilingual pre-training usually lags behind the monolingual competitors. In this work, we present two approaches to improve zero-shot cross-lingual classification, by transferring the knowledge from monolingual pretrained models to multilingual ones. Experimental results on two cross-lingual classification benchmarks show that our methods outperform vanilla multilingual fine-tuning.