Abstract
We describe a method to automatically extract a German lexicon from Wiktionary that is compatible with the finite-state morphological grammar SMOR. The main advantage of the resulting lexicon over existing lexica for SMOR is that it is open and permissively licensed. A recall-oriented evaluation shows that a morphological analyser built with our lexicon achieves coverage comparable to that of existing lexica, and that its coverage continues to improve as Wiktionary grows. We also describe modifications to the SMOR grammar that result in a more conventional lemmatisation of words.
- Anthology ID:
- L14-1114
- Volume:
- Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
- Month:
- May
- Year:
- 2014
- Address:
- Reykjavik, Iceland
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 1063–1067
- URL:
- http://www.lrec-conf.org/proceedings/lrec2014/pdf/116_Paper.pdf
- Cite (ACL):
- Rico Sennrich and Beat Kunz. 2014. Zmorge: A German Morphological Lexicon Extracted from Wiktionary. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 1063–1067, Reykjavik, Iceland. European Language Resources Association (ELRA).
- Cite (Informal):
- Zmorge: A German Morphological Lexicon Extracted from Wiktionary (Sennrich & Kunz, LREC 2014)