Abstract
In this paper, we describe an extension to a hybrid machine translation system for handling dialectal Arabic, using a decoding algorithm to normalize non-standard, spontaneous, and dialectal Arabic into Modern Standard Arabic. We demonstrate the feasibility of the approach by comparing machine translation results in terms of BLEU with and without the proposed approach. In our tests, we achieve a BLEU increase of about 1% on real-life broadcast input with transcriptions of dialectal speech, and of about 2% on web content with dialectal text.
- Anthology ID:
- 2010.amta-papers.5
- Volume:
- Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers
- Month:
- October 31-November 4
- Year:
- 2010
- Address:
- Denver, Colorado, USA
- Venue:
- AMTA
- Publisher:
- Association for Machine Translation in the Americas
- URL:
- https://aclanthology.org/2010.amta-papers.5
- Cite (ACL):
- Hassan Sawaf. 2010. Arabic Dialect Handling in Hybrid Machine Translation. In Proceedings of the 9th Conference of the Association for Machine Translation in the Americas: Research Papers, Denver, Colorado, USA. Association for Machine Translation in the Americas.
- Cite (Informal):
- Arabic Dialect Handling in Hybrid Machine Translation (Sawaf, AMTA 2010)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2010.amta-papers.5.pdf