Zmorge: A German Morphological Lexicon Extracted from Wiktionary

Rico Sennrich, Beat Kunz


Abstract
We describe a method to automatically extract a German lexicon from Wiktionary that is compatible with the finite-state morphological grammar SMOR. The main advantage of the resulting lexicon over existing lexica for SMOR is that it is open and permissively licensed. A recall-oriented evaluation shows that a morphological analyser built with our lexicon achieves coverage comparable to that of existing lexica, and that coverage continues to improve as Wiktionary grows. We also describe modifications to the SMOR grammar that result in a more conventional lemmatisation of words.
Anthology ID:
L14-1114
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1063–1067
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/116_Paper.pdf
Cite (ACL):
Rico Sennrich and Beat Kunz. 2014. Zmorge: A German Morphological Lexicon Extracted from Wiktionary. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 1063–1067, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
Zmorge: A German Morphological Lexicon Extracted from Wiktionary (Sennrich & Kunz, LREC 2014)