Abstract
We present a method for translating texts between close language pairs. The method does not require parallel data, and it does not require the languages to be written in the same script. We show results for six language pairs: Afrikaans/Dutch, Bosnian/Serbian, Danish/Swedish, Macedonian/Bulgarian, Malaysian/Indonesian, and Polish/Belorussian. We report BLEU scores showing our method to outperform others that do not use parallel data.
- Anthology ID:
- D17-1266
- Volume:
- Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
- Month:
- September
- Year:
- 2017
- Address:
- Copenhagen, Denmark
- Editors:
- Martha Palmer, Rebecca Hwa, Sebastian Riedel
- Venue:
- EMNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2513–2518
- URL:
- https://aclanthology.org/D17-1266
- DOI:
- 10.18653/v1/D17-1266
- Cite (ACL):
- Nima Pourdamghani and Kevin Knight. 2017. Deciphering Related Languages. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 2513–2518, Copenhagen, Denmark. Association for Computational Linguistics.
- Cite (Informal):
- Deciphering Related Languages (Pourdamghani & Knight, EMNLP 2017)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/D17-1266.pdf