Abstract
The task of unsupervised lexicon induction is to find translation pairs across monolingual corpora. We develop a novel method that creates seed lexicons by identifying cognates in the vocabularies of related languages on the basis of their frequency and lexical similarity. We apply bidirectional bootstrapping to a method which learns a linear mapping between context-based vector spaces. Experimental results on three language pairs show consistent improvement over prior work.
- Anthology ID:
- E17-2098
- Volume:
- Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers
- Month:
- April
- Year:
- 2017
- Address:
- Valencia, Spain
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 619–624
- URL:
- https://aclanthology.org/E17-2098
- Cite (ACL):
- Bradley Hauer, Garrett Nicolai, and Grzegorz Kondrak. 2017. Bootstrapping Unsupervised Bilingual Lexicon Induction. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 619–624, Valencia, Spain. Association for Computational Linguistics.
- Cite (Informal):
- Bootstrapping Unsupervised Bilingual Lexicon Induction (Hauer et al., EACL 2017)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/E17-2098.pdf
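
The abstract describes building a seed lexicon from cognates chosen by frequency and lexical similarity. The sketch below illustrates only that seeding idea and is not the authors' implementation: it assumes that lexical similarity can be approximated by normalized edit distance and that "frequency" means comparing words at nearby frequency ranks. The function names and the `rank_window` and `min_similarity` parameters are hypothetical, introduced here for illustration. The bidirectional bootstrapping of the linear mapping is not shown.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1]; 1.0 means identical strings."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def frequency_ranked(tokens):
    """Vocabulary sorted by descending frequency, ties broken alphabetically."""
    counts = Counter(tokens)
    return [w for w, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


def seed_lexicon(src_tokens, tgt_tokens, rank_window=2, min_similarity=0.55):
    """Pair each source word with its most similar target word of similar rank.

    Assumption (not from the paper): a candidate must lie within
    `rank_window` frequency ranks of the source word and reach
    `min_similarity` to be accepted as a cognate.
    """
    src_ranked = frequency_ranked(src_tokens)
    tgt_ranked = frequency_ranked(tgt_tokens)
    seeds = []
    for i, src in enumerate(src_ranked):
        lo = max(0, i - rank_window)
        hi = min(len(tgt_ranked), i + rank_window + 1)
        best = max(
            tgt_ranked[lo:hi],
            key=lambda tgt: normalized_similarity(src, tgt),
            default=None,
        )
        if best is not None and normalized_similarity(src, best) >= min_similarity:
            seeds.append((src, best))
    return seeds
```

In a pipeline of the kind the abstract outlines, pairs like these would serve as the initial supervision for fitting the mapping between the two vector spaces. A similarity threshold that is too low admits accidental look-alikes, so the cutoff is a tuning decision rather than a fixed value.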