Abstract
In this paper, we investigate different approaches to Arabic segmentation for statistical machine translation by comparing a rule-based segmenter with several statistical segmenters. We also present a new segmentation method that meets the needs of a real-time translation system without impairing translation accuracy.
- Anthology ID:
- 2010.iwslt-papers.15
- Volume:
- Proceedings of the 7th International Workshop on Spoken Language Translation: Papers
- Month:
- December 2-3
- Year:
- 2010
- Address:
- Paris, France
- Venue:
- IWSLT
- SIG:
- SIGSLT
- Pages:
- 321–327
- URL:
- https://preview.aclanthology.org/icon-24-ingestion/2010.iwslt-papers.15/
- Cite (ACL):
- Saab Mansour. 2010. MorphTagger: HMM-based Arabic segmentation for statistical machine translation. In Proceedings of the 7th International Workshop on Spoken Language Translation: Papers, pages 321–327, Paris, France.
- Cite (Informal):
- MorphTagger: HMM-based Arabic segmentation for statistical machine translation (Mansour, IWSLT 2010)
- PDF:
- https://preview.aclanthology.org/icon-24-ingestion/2010.iwslt-papers.15.pdf