Abstract
SYSTRAN began designing and developing Arabic, Farsi and Urdu to English machine translation systems in July 2002. This paper describes the methodology and implementation adopted for dictionary building and morphological analysis. SYSTRAN’s IntuitiveCoding® technology (ICT) facilitates the creation, update, and maintenance of Arabic, Farsi and Urdu lexical entries, and is more modular and less costly. ICT for Arabic, Farsi, and Urdu requires the implementation of stem-based lexical entries, the authentic scripts for each language, a statistical Arabic stem-guesser, and separate declarative modules for internal and external morphology.
- Anthology ID:
- 2003.mtsummit-semit.6
- Volume:
- Workshop on Machine Translation for Semitic languages: issues and approaches
- Month:
- September 23-27
- Year:
- 2003
- Address:
- New Orleans, USA
- Venue:
- MTSummit
- URL:
- https://aclanthology.org/2003.mtsummit-semit.6
- Cite (ACL):
- Ali Farghaly and Jean Senellart. 2003. Inductive coding of the Arabic lexicon. In Workshop on Machine Translation for Semitic languages: issues and approaches, New Orleans, USA.
- Cite (Informal):
- Inductive coding of the Arabic lexicon (Farghaly & Senellart, MTSummit 2003)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2003.mtsummit-semit.6.pdf