Pivot-based triangulation for low-resource languages

Rohit Dholakia, Anoop Sarkar


Abstract
This paper conducts a comprehensive study on the use of triangulation for four very low-resource languages: Mawukakan, Maninkakan, Haitian Kreyol, and Malagasy. To the best of our knowledge, ours is the first effective translation system for the first two of these languages. We improve translation quality by adding data using pivot languages and experimentally compare previously proposed triangulation design options. Furthermore, since the low-resource language pair and pivot language pair data typically come from very different domains, we use insights from domain adaptation to tune the weighted mixture of direct and pivot-based phrase pairs to improve translation quality.
Anthology ID:
2014.amta-researchers.24
Volume:
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track
Month:
October 22-26
Year:
2014
Address:
Vancouver, Canada
Venue:
AMTA
Publisher:
Association for Machine Translation in the Americas
Pages:
315–328
URL:
https://aclanthology.org/2014.amta-researchers.24
Cite (ACL):
Rohit Dholakia and Anoop Sarkar. 2014. Pivot-based triangulation for low-resource languages. In Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track, pages 315–328, Vancouver, Canada. Association for Machine Translation in the Americas.
Cite (Informal):
Pivot-based triangulation for low-resource languages (Dholakia & Sarkar, AMTA 2014)
PDF:
https://preview.aclanthology.org/paclic-22-ingestion/2014.amta-researchers.24.pdf