Combining fast_align with Hierarchical Sub-sentential Alignment for Better Word Alignments

Hao Wang, Yves Lepage


Abstract
fast_align is a simple and fast word alignment tool that is widely used in state-of-the-art machine translation systems. It yields comparable results in end-to-end translation experiments on various language pairs. However, fast_align does not perform as well as GIZA++ when applied to language pairs with distinct word orders, such as English and Japanese. In this paper, given the lexical translation table output by fast_align, we propose to realign words using the hierarchical sub-sentential alignment approach. Experimental results show that this simple additional processing improves word alignment performance, measured by counting alignment matches, in comparison with fast_align. We also report final machine translation results for both English-Japanese and Japanese-English. Our best system provides significant improvements over the baseline as measured by BLEU and RIBES.
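As a rough illustration of what "counting alignment matches" could look like in practice, the sketch below compares alignment links against reference links. It assumes both files use the `i-j` link format (one sentence pair per line, zero-based source-target index pairs) that fast_align writes; the abstract does not specify the exact metric, so the function names and the choice of returned counts are hypothetical and not taken from the paper.

```python
def parse_alignment(line):
    """Parse one line of 'i-j' links into a set of (source, target) index pairs."""
    links = set()
    for token in line.split():
        src, tgt = token.split("-")
        links.add((int(src), int(tgt)))
    return links


def count_matches(hypothesis_lines, reference_lines):
    """Count links shared by hypothesis and reference alignments.

    Returns (matched, hypothesis_total, reference_total), summed over
    all sentence pairs. This is an assumed evaluation, not the paper's.
    """
    matched = hyp_total = ref_total = 0
    for hyp_line, ref_line in zip(hypothesis_lines, reference_lines):
        hyp = parse_alignment(hyp_line)
        ref = parse_alignment(ref_line)
        matched += len(hyp & ref)  # links present in both alignments
        hyp_total += len(hyp)
        ref_total += len(ref)
    return matched, hyp_total, ref_total


if __name__ == "__main__":
    hypothesis = ["0-0 1-2 2-1"]
    reference = ["0-0 1-1 2-1"]
    print(count_matches(hypothesis, reference))
```

From the three counts, precision (`matched / hypothesis_total`) and recall (`matched / reference_total`) can be derived if a normalized score is preferred over raw match counts.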
Anthology ID:
W16-4501
Volume:
Proceedings of the Sixth Workshop on Hybrid Approaches to Translation (HyTra6)
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Patrik Lambert, Bogdan Babych, Kurt Eberle, Rafael E. Banchs, Reinhard Rapp, Marta R. Costa-jussà
Venue:
HyTra
Publisher:
The COLING 2016 Organizing Committee
Pages:
1–7
URL:
https://aclanthology.org/W16-4501
Cite (ACL):
Hao Wang and Yves Lepage. 2016. Combining fast_align with Hierarchical Sub-sentential Alignment for Better Word Alignments. In Proceedings of the Sixth Workshop on Hybrid Approaches to Translation (HyTra6), pages 1–7, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Combining fast_align with Hierarchical Sub-sentential Alignment for Better Word Alignments (Wang & Lepage, HyTra 2016)
PDF:
https://preview.aclanthology.org/ingest-2024-clasp/W16-4501.pdf