Abstract
This paper describes a set of methods for automatically acquiring Urdu nouns (and their gender) on the basis of inflectional and contextual clues. The algorithms combine brute-force computation over the corpus with carefully designed distinguishing rules based on linguistic knowledge. Because Urdu nouns, adjectives and verbs share homographic inflections, we compare potential inflectional forms with inflectional paradigms in a strict order and give a best guess of the part of speech for each word. We also handle irregular plurals, i.e. plural forms borrowed from Arabic, Persian and English. The evaluation shows that not all borrowed rules are equally productive in Urdu. The commonly used borrowed plural rules are reported in the results.
- Anthology ID:
- L14-1650
- Volume:
- Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
- Month:
- May
- Year:
- 2014
- Address:
- Reykjavik, Iceland
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 2846–2850
- URL:
- http://www.lrec-conf.org/proceedings/lrec2014/pdf/844_Paper.pdf
- Cite (ACL):
- Tafseer Ahmed Khan. 2014. Automatic acquisition of Urdu nouns (along with gender and irregular plurals). In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 2846–2850, Reykjavik, Iceland. European Language Resources Association (ELRA).
- Cite (Informal):
- Automatic acquisition of Urdu nouns (along with gender and irregular plurals) (Ahmed Khan, LREC 2014)
- PDF:
- http://www.lrec-conf.org/proceedings/lrec2014/pdf/844_Paper.pdf