Abstract
This paper describes our first experiment on Neural Machine Translation (NMT)-based query translation for the Amharic-Arabic Cross-Language Information Retrieval (CLIR) task, which retrieves relevant documents from Amharic and Arabic text collections in response to a query expressed in Amharic. We used a pre-trained NMT model to map a query in the source language into an equivalent query in the target language. The relevant documents are then retrieved using a Language Modeling (LM)-based retrieval algorithm. Experiments are conducted on four conventional IR models, namely the Uni-gram LM, the Bi-gram LM, the Probabilistic model, and the Vector Space Model (VSM). The results show that the proposed Uni-gram LM outperforms all other models on both the Amharic and Arabic document collections.
- Anthology ID:
- 2019.icon-1.7
- Volume:
- Proceedings of the 16th International Conference on Natural Language Processing
- Month:
- December
- Year:
- 2019
- Address:
- International Institute of Information Technology, Hyderabad, India
- Editors:
- Dipti Misra Sharma, Pushpak Bhattacharya
- Venue:
- ICON
- Publisher:
- NLP Association of India
- Pages:
- 56–64
- URL:
- https://preview.aclanthology.org/add_missing_videos/2019.icon-1.7/
- Cite (ACL):
- Ibrahim Gashaw and H.l Shashirekha. 2019. Language Modelling with NMT Query Translation for Amharic-Arabic Cross-Language Information Retrieval. In Proceedings of the 16th International Conference on Natural Language Processing, pages 56–64, International Institute of Information Technology, Hyderabad, India. NLP Association of India.
- Cite (Informal):
- Language Modelling with NMT Query Translation for Amharic-Arabic Cross-Language Information Retrieval (Gashaw & Shashirekha, ICON 2019)
- PDF:
- https://preview.aclanthology.org/add_missing_videos/2019.icon-1.7.pdf