Abstract
Morphological analysis is a fundamental task in natural-language processing and is used in other NLP applications such as part-of-speech tagging, syntactic parsing, information retrieval, and machine translation. In this paper, we present our work on the development of a free/open-source finite-state morphological analyser for Sindhi. We used Apertium’s lttoolbox as our finite-state toolkit to implement the transducer. The system is developed using a paradigm-based approach, wherein a paradigm defines all the word forms and their morphological features for a given stem (lemma). We evaluated our system on the Sindhi Wikipedia corpus and achieved a reasonable coverage of 81% and a precision of over 97%.
- Anthology ID:
- L16-1409
- Volume:
- Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
- Month:
- May
- Year:
- 2016
- Address:
- Portorož, Slovenia
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 2572–2577
- URL:
- https://aclanthology.org/L16-1409
- Cite (ACL):
- Raveesh Motlani, Francis Tyers, and Dipti Sharma. 2016. A Finite-State Morphological Analyser for Sindhi. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 2572–2577, Portorož, Slovenia. European Language Resources Association (ELRA).
- Cite (Informal):
- A Finite-State Morphological Analyser for Sindhi (Motlani et al., LREC 2016)
- PDF:
- https://preview.aclanthology.org/nodalida-main-page/L16-1409.pdf
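
The paradigm-based approach and the coverage figure mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use lttoolbox or its dictionary format. The names `Paradigm`, `build_lexicon`, and `coverage`, the tag strings, and the English toy data are all hypothetical and chosen only for illustration. Coverage is assumed here to mean the fraction of corpus tokens that receive at least one analysis.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Paradigm:
    """Hypothetical paradigm: each entry pairs a surface suffix with its tags."""
    name: str
    entries: Tuple[Tuple[str, str], ...]


def build_lexicon(stems: Dict[str, str],
                  paradigms: Dict[str, Paradigm]) -> Dict[str, List[str]]:
    """Expand every stem through its paradigm into surface form -> analyses."""
    lexicon: Dict[str, List[str]] = {}
    for stem, paradigm_name in stems.items():
        for suffix, tags in paradigms[paradigm_name].entries:
            lexicon.setdefault(stem + suffix, []).append(stem + tags)
    return lexicon


def coverage(tokens: List[str], lexicon: Dict[str, List[str]]) -> float:
    """Fraction of tokens with at least one analysis (assumed definition)."""
    if not tokens:
        return 0.0
    return sum(1 for token in tokens if token in lexicon) / len(tokens)


# Toy data, not Sindhi: one regular noun paradigm and one stem.
NOUN = Paradigm("noun_regular", (("", "<n><sg>"), ("s", "<n><pl>")))
lexicon = build_lexicon({"cat": "noun_regular"}, {"noun_regular": NOUN})
```

In this sketch, adding a new lemma only requires assigning it to an existing paradigm, which is the main appeal of a paradigm-based design: the inflectional pattern is written once and shared by every stem that follows it.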