Abstract
Arabic has a very rich and complex morphology. Its appropriate morphological processing is very important for Information Retrieval (IR). In this paper, we propose a new stemming technique that tries to determine the stem of a word representing the semantic core of this word according to Arabic morphology. This method is compared to a commonly used light stemming technique which truncates a word by simple rules. Our tests on TREC collections show that the new stemming technique is more effective than the light stemming.

- Anthology ID:
- 2006.bcs-1.6
- Volume:
- Proceedings of the International Conference on the Challenge of Arabic for NLP/MT
- Month:
- October 23
- Year:
- 2006
- Address:
- London, UK
- Venue:
- BCS
- Pages:
- 68–75
- URL:
- https://aclanthology.org/2006.bcs-1.6
- Cite (ACL):
- Youssef Kadri and Jian-Yun Nie. 2006. Effective Stemming for Arabic Information Retrieval. In Proceedings of the International Conference on the Challenge of Arabic for NLP/MT, pages 68–75, London, UK.
- Cite (Informal):
- Effective Stemming for Arabic Information Retrieval (Kadri & Nie, BCS 2006)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/2006.bcs-1.6.pdf