Abstract
We describe a morphological analyzer for the Swahili language, written in an extension of XFST/LEXC intended for the easy declaration of morphophonological patterns and importation of lexical resources. Our analyzer was supplemented extensively with data from the Kamusi Project (kamusi.org), a user-contributed multilingual dictionary. Making use of this resource allowed us to achieve wide lexical coverage quickly, but the heterogeneous nature of user-contributed content also poses some challenges when adapting it for use in an expert system.
- Anthology ID:
- L14-1686
- Volume:
- Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
- Month:
- May
- Year:
- 2014
- Address:
- Reykjavik, Iceland
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 3333–3339
- URL:
- http://www.lrec-conf.org/proceedings/lrec2014/pdf/896_Paper.pdf
- Cite (ACL):
- Patrick Littell, Kaitlyn Price, and Lori Levin. 2014. Morphological parsing of Swahili using crowdsourced lexical resources. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 3333–3339, Reykjavik, Iceland. European Language Resources Association (ELRA).
- Cite (Informal):
- Morphological parsing of Swahili using crowdsourced lexical resources (Littell et al., LREC 2014)