Challenges in Building an Arabic-English GHMT System with SMT Components

Nizar Habash, Bonnie Dorr, Christof Monz


Abstract
The research context of this paper is developing hybrid machine translation (MT) systems that exploit the advantages of linguistic rule-based and statistical MT systems. Arabic, as a morphologically rich language, is especially challenging even without addressing the hybridization question. In this paper, we describe the challenges in building an Arabic-English generation-heavy machine translation (GHMT) system and boosting it with statistical machine translation (SMT) components. We present an extensive evaluation of multiple system variants and report positive results on the advantages of hybridization.
Anthology ID:
2006.amta-papers.7
Volume:
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers
Month:
August 8-12
Year:
2006
Address:
Cambridge, Massachusetts, USA
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
Pages:
56–65
URL:
https://aclanthology.org/2006.amta-papers.7
Cite (ACL):
Nizar Habash, Bonnie Dorr, and Christof Monz. 2006. Challenges in Building an Arabic-English GHMT System with SMT Components. In Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers, pages 56–65, Cambridge, Massachusetts, USA. Association for Machine Translation in the Americas.
Cite (Informal):
Challenges in Building an Arabic-English GHMT System with SMT Components (Habash et al., AMTA 2006)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/2006.amta-papers.7.pdf