Hongye Liu
2025
Learning to Substitute Words with Model-based Score Ranking
Hongye Liu | Ricardo Henao
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Smart word substitution aims to enhance sentence quality by improving word choices; however, current benchmarks rely on human-labeled data, which suffers from subjectivity and lacks diversity because the number of annotators is limited. Since word choices are inherently subjective, ground-truth word substitutions generated by a small group of annotators are often incomplete and unlikely to generalize. To circumvent this issue, we instead employ a model-based score (BARTScore) to quantify sentence quality, thus forgoing the need for human annotations. Specifically, we use this score to define a distribution for each word substitution, allowing one to test whether a substitution is statistically superior to others. Further, we propose a loss function that directly optimizes the alignment between model predictions and sentence scores, while also enhancing the overall quality score of a substitution. Crucially, model learning no longer requires human labels, thus avoiding the cost of annotation while maintaining the quality of the text modified with substitutions. Experimental results show that the proposed approach outperforms both masked language models (BERT, BART) and large language models (GPT-4, LLaMA).
2022
GuoFeng: A Benchmark for Zero Pronoun Recovery and Translation
Mingzhou Xu | Longyue Wang | Derek F. Wong | Hongye Liu | Linfeng Song | Lidia S. Chao | Shuming Shi | Zhaopeng Tu
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
The zero pronoun (ZP) phenomenon has attracted increasing interest in the machine translation (MT) community due to its importance and difficulty. However, previous studies generally evaluate the quality of ZP translation with BLEU scores on MT test sets, a measure that is not expressive or sensitive enough for accurate assessment. To bridge the data and evaluation gaps, we propose a benchmark test set for targeted evaluation of Chinese-English ZP translation. The human-annotated test set covers five challenging genres, which reveal different characteristics of ZPs for comprehensive evaluation. We systematically revisit eight advanced models on ZP translation and identify current challenges for future exploration. We release data, code, models and annotation guidelines, which we hope can significantly promote research in this field (https://github.com/longyuewangdcu/mZPRT).
Co-authors
- Lidia S. Chao 1
- Ricardo Henao 1
- Shuming Shi 1
- Linfeng Song 1
- Zhaopeng Tu 1