Alberto Samaniego
2014
Guampa: a Toolkit for Collaborative Translation
Alex Rudnick
|
Taylor Skidmore
|
Alberto Samaniego
|
Michael Gasser
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Here we present Guampa, a new software package for online collaborative translation. This system grows out of our discussions with Guarani-language activists and educators in Paraguay, and attempts to address problems faced by machine translation researchers and by members of any community speaking an under-represented language. Guampa enables volunteers and students to work together to translate documents into heritage languages, both to make more materials available in those languages, and also to generate bitext suitable for training machine translation systems. While many approaches to crowdsourcing bitext corpora focus on Mechanical Turk and temporarily engaging anonymous workers, Guampa is intended to foster an online community in which discussions can take place, language learners can practice their translation skills, and complete documents can be translated. This approach is appropriate for the Spanish-Guarani language pair as there are many speakers of both languages, and Guarani has a dedicated activist community. Our goal is to make it easy for anyone to set up their own instance of Guampa and populate it with documents -- such as automatically imported Wikipedia articles -- to be translated for their particular language pair. Guampa is freely available and relatively easy to use.