Morpho-syntactic Lexicon Generation Using Graph-based Semi-supervised Learning

Manaal Faruqui, Ryan McDonald, Radu Soricut


Abstract
Morpho-syntactic lexicons provide information about the morphological and syntactic roles of words in a language. Such lexicons are not available for all languages, and even when available, their coverage can be limited. We present a graph-based semi-supervised learning method that uses the morphological, syntactic and semantic relations between words to automatically construct wide-coverage lexicons from small seed sets. Our method is language-independent, and we show that we can expand a 1000-word seed lexicon to more than 100 times its size with high quality for 11 languages. In addition, the automatically created lexicons provide features that improve performance in two downstream tasks: morphological tagging and dependency parsing.
Anthology ID:
Q16-1001
Volume:
Transactions of the Association for Computational Linguistics, Volume 4
Year:
2016
Address:
Cambridge, MA
Editors:
Lillian Lee, Mark Johnson, Kristina Toutanova
Venue:
TACL
Publisher:
MIT Press
Pages:
1–16
URL:
https://aclanthology.org/Q16-1001
DOI:
10.1162/tacl_a_00079
Cite (ACL):
Manaal Faruqui, Ryan McDonald, and Radu Soricut. 2016. Morpho-syntactic Lexicon Generation Using Graph-based Semi-supervised Learning. Transactions of the Association for Computational Linguistics, 4:1–16.
Cite (Informal):
Morpho-syntactic Lexicon Generation Using Graph-based Semi-supervised Learning (Faruqui et al., TACL 2016)
PDF:
https://preview.aclanthology.org/nschneid-patch-5/Q16-1001.pdf