Runzhe Zhan


2022

pdf bib
中国语言学研究 70 年:核心期刊的词汇增长(70 Years of Linguistics Research in China: Vocabulary Growth of Core Journals)
Shan Wang (王珊) | Runzhe Zhan (詹润哲) | Shuangyun Yao (姚双云)
Proceedings of the 21st Chinese National Conference on Computational Linguistics

“建国以来我国语言学经过 70 年的发展取得了瞩目的成就,已有研究主要以回顾主要历史事件的方式介绍这一进程,但尚缺少使用量化手段分析其历时发展的研究。本文以词汇增长为切入点探究这一主题,首次创建大规模语言学中文核心期刊摘要的历时语料库,并使用三大词汇增长模型预测语料库中词汇的变化。本文选择拟合效果最好的 Heaps 模型分阶段深入分析语言学词汇的变化,显示出国家政策的指导作用和特定时代的语言生活特征。此外,与时序无关的验证程序支撑了本文研究方法的有效性。 关键词:中国语言学;词汇增长;核心期刊;摘要;语料库;历时发展”

2021

pdf
Difficulty-Aware Machine Translation Evaluation
Runzhe Zhan | Xuebo Liu | Derek F. Wong | Lidia S. Chao
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

The high-quality translation results produced by machine translation (MT) systems still pose a huge challenge for automatic evaluation. Current MT evaluation pays the same attention to each sentence component, while the questions of real-world examinations (e.g., university examinations) have different difficulties and weightings. In this paper, we propose a novel difficulty-aware MT evaluation metric, expanding the evaluation dimension by taking translation difficulty into consideration. A translation that fails to be predicted by most MT systems will be treated as a difficult one and assigned a large weight in the final score function, and conversely. Experimental results on the WMT19 English-German Metrics shared tasks show that our proposed method outperforms commonly used MT metrics in terms of human correlation. In particular, our proposed method performs well even when all the MT systems are very competitive, which is when most existing metrics fail to distinguish between them. The source code is freely available at https://github.com/NLP2CT/Difficulty-Aware-MT-Evaluation.