A Bidirectional Transformer Based Alignment Model for Unsupervised Word Alignment

Jingyi Zhang, Josef van Genabith


Abstract
Word alignment and machine translation are two closely related tasks. Neural translation models, such as RNN-based and Transformer models, employ a target-to-source attention mechanism which can provide rough word alignments, but with rather low accuracy. High-quality word alignment can help neural machine translation in many different ways, such as missing word detection, annotation transfer and lexicon injection. Existing methods for learning word alignment include statistical word aligners (e.g. GIZA++) and, more recently, neural word alignment models. This paper presents a bidirectional Transformer-based alignment (BTBA) model for unsupervised learning of the word alignment task. Our BTBA model predicts the current target word by attending to the source context and to both left-side and right-side target context to produce accurate target-to-source attention (alignment). We further fine-tune the target-to-source attention in the BTBA model to obtain better alignments using a full-context-based optimization method and self-supervised training. We test our method on three word alignment tasks and show that our method outperforms both previous neural word alignment approaches and the popular statistical word aligner GIZA++.
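To make the modeling idea in the abstract concrete, below is a minimal PyTorch sketch of the core mechanism: each target word is predicted from the encoded source and from both left-side and right-side target context (only the current position is masked out of the target self-attention), and word alignments are read off the target-to-source cross-attention weights. This is an illustrative reconstruction, not the authors' code; the class name BTBASketch, the single-layer setup, and all sizes are assumptions, and the paper's full-context optimization and self-supervised fine-tuning steps are omitted.

    import torch
    import torch.nn as nn

    class BTBASketch(nn.Module):
        # Hypothetical sketch: predict each target word from the source and
        # from BOTH the left- and right-side target context; the position
        # being predicted is masked out of the target self-attention, and
        # alignments are read off the target-to-source attention weights.
        def __init__(self, vocab_size, d_model=64, n_heads=4):
            super().__init__()
            self.src_emb = nn.Embedding(vocab_size, d_model)
            self.tgt_emb = nn.Embedding(vocab_size, d_model)
            self.encoder = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.out = nn.Linear(d_model, vocab_size)

        def forward(self, src_ids, tgt_ids):
            src = self.encoder(self.src_emb(src_ids))    # (batch, src_len, d_model)
            tgt = self.tgt_emb(tgt_ids)                  # (batch, tgt_len, d_model)
            tgt_len = tgt_ids.size(1)
            # Unlike a left-to-right causal mask, only the diagonal is masked:
            # position i may attend to every other target word, left AND right.
            diag = torch.eye(tgt_len, dtype=torch.bool, device=tgt_ids.device)
            h, _ = self.self_attn(tgt, tgt, tgt, attn_mask=diag)
            # Target-to-source attention; its weights act as soft alignments.
            h, align = self.cross_attn(h, src, src)      # align: (batch, tgt_len, src_len)
            return self.out(h), align

    # Usage: hard alignments are the source positions attended to most.
    model = BTBASketch(vocab_size=1000)
    src = torch.randint(0, 1000, (2, 7))
    tgt = torch.randint(0, 1000, (2, 5))
    logits, align = model(src, tgt)
    hard = align.argmax(dim=-1)                          # (batch, tgt_len)

In this sketch the model is trained like a masked language model over the target side (cross-entropy between logits and tgt_ids), so it remains unsupervised with respect to gold alignments.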
Anthology ID:
2021.acl-long.24
Volume:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
283–292
URL:
https://aclanthology.org/2021.acl-long.24
DOI:
10.18653/v1/2021.acl-long.24
Cite (ACL):
Jingyi Zhang and Josef van Genabith. 2021. A Bidirectional Transformer Based Alignment Model for Unsupervised Word Alignment. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 283–292, Online. Association for Computational Linguistics.
Cite (Informal):
A Bidirectional Transformer Based Alignment Model for Unsupervised Word Alignment (Zhang & van Genabith, ACL-IJCNLP 2021)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2021.acl-long.24.pdf
Video:
https://preview.aclanthology.org/auto-file-uploads/2021.acl-long.24.mp4