Mansouri Rad

2006

In this paper we describe a set of processes for acquiring resources for quick ramp-up machine translation (MT) into English from any language lacking significant machine-tractable resources, using the Paraguayan indigenous language Guarani, as well as Amharic and Chechen, as examples. Our task is to develop a 250,000 monolingual corpus, a 250,000 bilingual parallel corpus, and smaller corpora tagged with part-of-speech, named-entity, and morphological annotations.

Co-authors

Ahmed Abdelali 1
James Cowie 1
Steve Helmreich 1
Wanying Jin 1
Maria Pilar Milagros 1
Bill Ogden 1
Ron Zacharski 1

Venues

amta1
