Maria Pilar Milagros


2006

Guarani: A Case Study in Resource Development for Quick Ramp-Up MT
Ahmed Abdelali | James Cowie | Steve Helmreich | Wanying Jin | Maria Pilar Milagros | Bill Ogden | Mansouri Rad | Ron Zacharski
Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers

In this paper we describe a set of processes for the acquisition of resources for quick ramp-up machine translation (MT) from any language lacking significant machine-tractable resources into English, using the Paraguayan indigenous language Guarani, as well as Amharic and Chechen, as examples. Our task is to develop a 250,000 monolingual corpus, a 250,000 bilingual parallel corpus, and smaller corpora tagged with part of speech, named entity, and morphological annotations.