GWLAN: General Word-Level AutocompletioN for Computer-Aided Translation

Huayang Li, Lemao Liu, Guoping Huang, Shuming Shi


Abstract
Computer-aided translation (CAT), the use of software to assist a human translator during translation, has proven useful for improving translator productivity. Autocompletion, which suggests translation results based on the text pieces a human translator provides, is a core function of CAT. Previous research in this line has two limitations. First, most work on this topic focuses on sentence-level autocompletion (i.e., generating the whole translation as a sentence based on human input), while word-level autocompletion remains under-explored. Second, almost no public benchmarks are available for the autocompletion task of CAT. This might be among the reasons why research progress in CAT is much slower than in automatic MT. In this paper, we propose the task of general word-level autocompletion (GWLAN), derived from a real-world CAT scenario, and construct the first public benchmark to facilitate research on this topic. In addition, we propose an effective method for GWLAN and compare it with several strong baselines. Experiments demonstrate that our proposed method gives significantly more accurate predictions than the baseline methods on our benchmark datasets.
Anthology ID:
2021.acl-long.370
Volume:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
4792–4802
URL:
https://aclanthology.org/2021.acl-long.370
DOI:
10.18653/v1/2021.acl-long.370
Cite (ACL):
Huayang Li, Lemao Liu, Guoping Huang, and Shuming Shi. 2021. GWLAN: General Word-Level AutocompletioN for Computer-Aided Translation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 4792–4802, Online. Association for Computational Linguistics.
Cite (Informal):
GWLAN: General Word-Level AutocompletioN for Computer-Aided Translation (Li et al., ACL-IJCNLP 2021)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2021.acl-long.370.pdf
Video:
https://preview.aclanthology.org/ingestion-script-update/2021.acl-long.370.mp4