Abstract
This paper describes YaMTG (Yet another Multilingual Translation Graph), a new open-source heavily multilingual translation database (over 664 languages represented) built using several sources, namely various wiktionaries and the OPUS parallel corpora (Tiedemann, 2009). We detail the translation extraction process for 21 wiktionary language editions, and provide an evaluation of the translations contained in YaMTG.
- Anthology ID:
- L14-1616
- Volume:
- Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
- Month:
- May
- Year:
- 2014
- Address:
- Reykjavik, Iceland
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 3179–3186
- URL:
- http://www.lrec-conf.org/proceedings/lrec2014/pdf/792_Paper.pdf
- Cite (ACL):
- Valérie Hanoka and Benoît Sagot. 2014. An Open-Source Heavily Multilingual Translation Graph Extracted from Wiktionaries and Parallel Corpora. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 3179–3186, Reykjavik, Iceland. European Language Resources Association (ELRA).
- Cite (Informal):
- An Open-Source Heavily Multilingual Translation Graph Extracted from Wiktionaries and Parallel Corpora (Hanoka & Sagot, LREC 2014)