2023
SmartSpanNER: Making SpanNER Robust in Low Resource Scenarios
Min Zhang | Xiaosong Qiao | Yanqing Zhao | Shimin Tao | Hao Yang
Findings of the Association for Computational Linguistics: EMNLP 2023
Named Entity Recognition (NER) is one of the most fundamental tasks in natural language processing. Span-level prediction (SpanNER) is more naturally suited to nested NER than sequence labeling (SeqLab). However, according to our experiments, the SpanNER method is more sensitive to the amount of training data, i.e., the F1 score of SpanNER drops much more than that of SeqLab when the amount of training data decreases. To improve the robustness of SpanNER in low resource scenarios, we propose a simple and effective method, SmartSpanNER, which introduces a Named Entity Head (NEH) prediction task to SpanNER and performs multi-task learning together with the task of span classification. Experimental results demonstrate that SmartSpanNER greatly improves the robustness of SpanNER in low resource scenarios constructed on the CoNLL03, Few-NERD, GENIA and ACE05 standard benchmark datasets.
2022
Part Represents Whole: Improving the Evaluation of Machine Translation System Using Entropy Enhanced Metrics
Yilun Liu | Shimin Tao | Chang Su | Min Zhang | Yanqing Zhao | Hao Yang
Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022
Machine translation (MT) metrics often correlate poorly with human assessments. In MT system evaluation, most metrics pay equal attention to every sample in an evaluation set, while in human evaluation, difficult sentences often make candidate systems distinguishable via notable fluctuations in human scores, especially when systems are competitive. We find that samples with high entropy values, although they usually account for less than 5% of the set, tend to play a key role in MT evaluation: when the evaluation set is shrunk to only the high-entropy portion, correlations with human assessments actually improve. Thus, in this paper, we propose a fast and unsupervised approach to enhance MT metrics using entropy, expanding the dimension of evaluation by introducing sentence-level difficulty. A translation hypothesis with a significantly high entropy value is considered difficult and receives a large weight in the aggregation of system-level scores. Experimental results on five sub-tracks in the WMT19 Metrics shared tasks show that our proposed method significantly enhances the performance of commonly used MT metrics in terms of system-level correlations with human assessments, even outperforming existing SOTA metrics. In particular, all enhanced metrics exhibit overall stability in correlations with human assessments when only competitive MT systems are included, while the corresponding vanilla metrics fail to correlate with human assessments.
Partial Could Be Better than Whole. HW-TSC 2022 Submission for the Metrics Shared Task
Yilun Liu | Xiaosong Qiao | Zhanglin Wu | Su Chang | Min Zhang | Yanqing Zhao | Song Peng | Shimin Tao | Hao Yang | Ying Qin | Jiaxin Guo | Minghan Wang | Yinglu Li | Peng Li | Xiaofeng Zhao
Proceedings of the Seventh Conference on Machine Translation (WMT)
In this paper, we present the contribution of HW-TSC to the WMT 2022 Metrics Shared Task. We propose one reference-based metric, HWTSC-EE-BERTScore*, and four reference-free metrics: HWTSC-Teacher-Sim, HWTSC-TLM, KG-BERTScore and CROSS-QE. Among these metrics, HWTSC-Teacher-Sim and CROSS-QE are supervised, whereas HWTSC-EE-BERTScore*, HWTSC-TLM and KG-BERTScore are unsupervised. We use these metrics in the segment-level and system-level tracks. Overall, our systems achieve strong results for all language pairs on previous test sets and a new state of the art in many system-level case sets.
HW-TSC Translation Systems for the WMT22 Chat Translation Task
Jinlong Yang | Zongyao Li | Daimeng Wei | Hengchao Shang | Xiaoyu Chen | Zhengzhe Yu | Zhiqiang Rao | Shaojun Li | Zhanglin Wu | Yuhao Xie | Yuanchang Luo | Ting Zhu | Yanqing Zhao | Lizhi Lei | Hao Yang | Ying Qin
Proceedings of the Seventh Conference on Machine Translation (WMT)
This paper describes the submissions of Huawei Translation Services Center (HW-TSC) to the WMT22 chat translation shared task on English-German (en-de) in both directions, with results for the zero-shot and few-shot tracks. We use the deep transformer architecture with a larger parameter size. Our submissions to the WMT21 News Translation task are used as the baselines. We adopt strategies such as back translation, forward translation, domain transfer, data selection, and noisy forward translation in this task, and achieve competitive results on the development set. We also test the effectiveness of document translation on chat tasks. The results on the development set show that, due to the lack of chat data, it is not as effective as sentence-level translation models.
2012
A CRF Sequence Labeling Approach to Chinese Punctuation Prediction
Yanqing Zhao | Chaoyue Wang | Guohong Fu
Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation