Valérie Hanoka
2014
An Open-Source Heavily Multilingual Translation Graph Extracted from Wiktionaries and Parallel Corpora
Valérie Hanoka | Benoît Sagot
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
This paper describes YaMTG (Yet another Multilingual Translation Graph), a new open-source heavily multilingual translation database (over 664 languages represented) built using several sources, namely various wiktionaries and the OPUS parallel corpora (Tiedemann, 2009). We detail the translation extraction process for 21 wiktionary language editions, and provide an evaluation of the translations contained in YaMTG.
2012
Wordnet extension made simple: A multilingual lexicon-based approach using wiki resources
Valérie Hanoka | Benoît Sagot
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
In this paper, we propose a simple methodology for building or extending wordnets using easily extractible lexical knowledge from Wiktionary and Wikipedia. This method relies on a large multilingual translation/synonym graph as well as synset-aligned wordnets. It guesses frequent and polysemous literals that are difficult to find using other methods by looking at back-translations in the graph, showing that the use of a heavily multilingual lexicon can be a way to mitigate the lack of wide-coverage bilingual lexicons for wordnet creation or extension. We evaluate our approach on French by applying it to extend WOLF, a freely available French wordnet.
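The graph-based idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `build_graph` and `propose_literals`, the `min_support` threshold, the simple vote-counting score, and the toy data are all assumptions made for illustration. The sketch treats each (language, word) pair as a node, links translations with undirected edges, and proposes target-language literals for a synset by counting how many of the synset's known literals in other languages translate to the same target word.

```python
from collections import defaultdict


def build_graph(pairs):
    """Build an undirected translation graph from ((lang, word), (lang, word)) pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def propose_literals(graph, synset_literals, target_lang, min_support=2):
    """Rank target-language words by how many synset literals translate to them.

    Hypothetical scoring: one vote per distinct source literal.
    """
    support = defaultdict(set)
    for literal in synset_literals:
        for lang, word in graph.get(literal, ()):
            if lang == target_lang:
                support[word].add(literal)
    # Sort by descending support, then alphabetically for a stable order.
    ranked = sorted(support.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(word, len(src)) for word, src in ranked if len(src) >= min_support]


# Toy data (invented for illustration, not taken from YaMTG or WOLF).
pairs = [
    (("en", "dog"), ("fr", "chien")),
    (("es", "perro"), ("fr", "chien")),
    (("de", "Hund"), ("fr", "chien")),
    (("en", "dog"), ("fr", "clebs")),
]
graph = build_graph(pairs)
synset = [("en", "dog"), ("es", "perro"), ("de", "Hund")]
candidates = propose_literals(graph, synset, "fr")
print(candidates)
```

In this toy setting, a candidate supported by several languages at once outranks one reached through a single bilingual link, which is the intuition behind using a heavily multilingual lexicon in place of one wide-coverage bilingual lexicon.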