2023
KG-IQES: An Interpretable Quality Estimation System for Machine Translation Based on Knowledge Graph
Junhao Zhu | Min Zhang | Hao Yang | Song Peng | Zhanglin Wu | Yanfei Jiang | Xijun Qiu | Weiqiang Pan | Ming Zhu | Ma Miaomiao | Weidong Zhang
Proceedings of Machine Translation Summit XIX, Vol. 2: Users Track
The widespread use of machine translation (MT) has driven the need for effective automatic quality estimation (AQE) methods, and enhancing the interpretability of MT quality estimation is of considerable interest to industry. From the perspective of aligning named entities (NEs) between source and translated sentences, we construct a multilingual knowledge graph (KG) of domain-specific NEs and design a KG-based interpretable quality estimation system for machine translation (KG-IQES). KG-IQES effectively estimates translation quality without relying on reference translations, and its effectiveness has been verified in our business scenarios.
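The NE-alignment idea above can be sketched as follows. This is a minimal illustration only: the KG entries and the `ne_alignment_score` helper are hypothetical assumptions, not the paper's actual system.

```python
# Hedged sketch: a multilingual KG represented as a dict mapping a
# source-language named entity to its accepted target-language surface forms.
# The single entry below is illustrative, not taken from the paper.
KG = {
    "Huawei": {"Huawei", "华为"},
}

def ne_alignment_score(source_nes, target_sentence):
    """Fraction of source NEs whose KG-sanctioned translation appears
    in the target sentence (reference-free by construction)."""
    if not source_nes:
        return 1.0
    hits = sum(
        1 for ne in source_nes
        if any(form in target_sentence for form in KG.get(ne, {ne}))
    )
    return hits / len(source_nes)
```

A score of 1.0 means every source NE was rendered by a KG-approved form; lower scores flag the specific entities that were mistranslated, which is where the interpretability comes from.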
Empowering a Metric with LLM-assisted Named Entity Annotation: HW-TSC’s Submission to the WMT23 Metrics Shared Task
Zhanglin Wu | Yilun Liu | Min Zhang | Xiaofeng Zhao | Junhao Zhu | Ming Zhu | Xiaosong Qiao | Jingfei Zhang | Ma Miaomiao | Zhao Yanqing | Song Peng | Shimin Tao | Hao Yang | Yanfei Jiang
Proceedings of the Eighth Conference on Machine Translation
This paper presents the submission of Huawei Translation Service Center (HW-TSC) to the WMT23 metrics shared task, to which we submit two metrics: KG-BERTScore and HWTSC-EE-Metric. KG-BERTScore is our primary reference-free submission and provides both segment-level and system-level scoring, while HWTSC-EE-Metric is our primary reference-based submission and provides system-level scoring only. Overall, our metrics show relatively high correlations with MQM scores on the metrics tasks of previous years. On system-level scoring tasks in particular, our metrics achieve new state-of-the-art results on many language pairs.
HW-TSC’s Participation in the WMT 2023 Automatic Post Editing Shared Task
Jiawei Yu | Min Zhang | Zhao Yanqing | Xiaofeng Zhao | Yuang Li | Su Chang | Yinglu Li | Ma Miaomiao | Shimin Tao | Hao Yang
Proceedings of the Eighth Conference on Machine Translation
This paper presents HW-TSC's submission to the WMT 2023 Automatic Post Editing (APE) shared task for the English-Marathi (En-Mr) language pair. Our method encompasses several key steps. First, we pre-train an APE model on the synthetic APE data provided by the task organizers. We then fine-tune the model on real APE data. For data augmentation, we incorporate candidate translations obtained from an external machine translation (MT) system, and we add the En-Mr parallel corpus from the Flores-200 dataset to our training data. To address overfitting, we employ R-Drop during training. Because APE systems tend toward 'over-correction', we use a sentence-level quality estimation (QE) system to select the final output, choosing between the original translation and the corresponding output generated by the APE model. Our experiments demonstrate that pre-trained APE models are effective when fine-tuned on an APE corpus of limited size, and that performance improves further with external MT augmentation. Our approach improves TER and BLEU scores on the development set by -2.42 and +3.76 points, respectively.
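The final QE-guided selection step can be sketched as below. The `qe_score` function is a hypothetical placeholder for a trained sentence-level QE model (here replaced by a trivial length-ratio heuristic purely for illustration); only the selection logic reflects the abstract.

```python
def qe_score(source: str, translation: str) -> float:
    # Placeholder for a real sentence-level QE model: higher = better.
    # Illustrative heuristic: penalize target/source length-ratio drift from 1.
    ratio = len(translation.split()) / max(len(source.split()), 1)
    return 1.0 - abs(1.0 - ratio)

def select_output(source: str, mt_output: str, ape_output: str) -> str:
    """Keep the APE output only when the QE model prefers it,
    guarding against APE 'over-correction'."""
    if qe_score(source, ape_output) > qe_score(source, mt_output):
        return ape_output
    return mt_output
```

Defaulting to the original MT output on ties is a conservative choice consistent with the over-correction concern: the APE output must be strictly preferred to replace the baseline translation.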
2022
CrossQE: HW-TSC 2022 Submission for the Quality Estimation Shared Task
Shimin Tao | Su Chang | Ma Miaomiao | Hao Yang | Xiang Geng | Shujian Huang | Min Zhang | Jiaxin Guo | Minghan Wang | Yinglu Li
Proceedings of the Seventh Conference on Machine Translation (WMT)
Quality estimation (QE) investigates automatic methods for estimating the quality of machine translation output without reference translations. This paper presents Huawei Translation Services Center's (HW-TSC's) system, CrossQE, for WMT 2022 QE shared tasks 1 and 2: sentence- and word-level quality prediction, and explainable QE. For task 1, CrossQE employs the predictor-estimator framework, with a pre-trained cross-lingual XLM-RoBERTa-large model as the predictor and a task-specific classifier or regressor as the estimator. An extensive set of experiments shows that adding a bottleneck adapter layer, a mean-teacher loss, a masked language modeling task loss, and MC dropout to CrossQE each improves performance to some extent. For task 2, CrossQE computes the cosine similarity between each word feature in the target and each word feature in the source using the task 1 sentence-level QE system's predictor, and takes the inverse of the maximum similarity between each target word and the source words as that word's translation error risk. CrossQE also performs strongly on the WMT 2022 QE test sets.
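The task 2 risk computation can be sketched as follows, assuming word features are already extracted as vectors. Interpreting the "inverse value" of the maximum similarity as `1 - max_similarity` is our assumption; the function names are illustrative.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def word_error_risks(target_feats, source_feats):
    """For each target word feature, risk = 1 - max cosine similarity
    against all source word features: a target word that aligns well
    with some source word gets a low translation-error risk."""
    return [1.0 - max(cosine(t, s) for s in source_feats)
            for t in target_feats]
```

A target word whose feature closely matches some source word's feature receives a risk near 0, while an unaligned word's risk approaches 1, yielding a word-level error signal without references.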