Multi-Align: combining linguistic and statistical techniques to improve alignments for adaptable MT

Necip Fazil Ayan, Bonnie Dorr, Nizar Habash


Abstract
An adaptable statistical or hybrid MT system relies heavily on the quality of word-level alignments of real-world data. Statistical alignment approaches provide a reasonable initial estimate for word alignment. However, they cannot handle certain types of linguistic phenomena such as long-distance dependencies and structural differences between languages. We address this issue in Multi-Align, a new framework for incremental testing of different alignment algorithms and their combinations. Our design allows users to tune their systems to the properties of a particular genre/domain while still benefiting from general linguistic knowledge associated with a language pair. We demonstrate that a combination of statistical and linguistically-informed alignments can resolve translation divergences during the alignment process.
Anthology ID:
2004.amta-papers.3
Volume:
Proceedings of the 6th Conference of the Association for Machine Translation in the Americas: Technical Papers
Month:
September 28 - October 2
Year:
2004
Address:
Washington, USA
Venue:
AMTA
Publisher:
Springer
Pages:
17–26
URL:
https://link.springer.com/chapter/10.1007/978-3-540-30194-3_3
Cite (ACL):
Necip Fazil Ayan, Bonnie Dorr, and Nizar Habash. 2004. Multi-Align: combining linguistic and statistical techniques to improve alignments for adaptable MT. In Proceedings of the 6th Conference of the Association for Machine Translation in the Americas: Technical Papers, pages 17–26, Washington, USA. Springer.
Cite (Informal):
Multi-Align: combining linguistic and statistical techniques to improve alignments for adaptable MT (Ayan et al., AMTA 2004)