Abstract
The promise of Large Language Models (LLMs) in Natural Language Processing has often been overshadowed by their limited performance in low-resource languages such as Bangla. To address this, our paper presents a pioneering approach that uses cross-lingual retrieval-augmented in-context learning. By strategically sourcing semantically similar prompts from high-resource languages, we enable multilingual pretrained language models (MPLMs), especially the generative model BLOOMZ, to boost performance on Bangla tasks. Our extensive evaluation shows that cross-lingual retrieval-augmented prompts bring steady improvements to MPLMs over zero-shot performance.
- Anthology ID:
- 2023.banglalp-1.15
- Volume:
- Proceedings of the First Workshop on Bangla Language Processing (BLP-2023)
- Month:
- December
- Year:
- 2023
- Address:
- Singapore
- Editors:
- Firoj Alam, Sudipta Kar, Shammur Absar Chowdhury, Farig Sadeque, Ruhul Amin
- Venue:
- BanglaLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 136–151
- URL:
- https://aclanthology.org/2023.banglalp-1.15
- DOI:
- 10.18653/v1/2023.banglalp-1.15
- Cite (ACL):
- Xiaoqian Li, Ercong Nie, and Sheng Liang. 2023. Crosslingual Retrieval Augmented In-context Learning for Bangla. In Proceedings of the First Workshop on Bangla Language Processing (BLP-2023), pages 136–151, Singapore. Association for Computational Linguistics.
- Cite (Informal):
- Crosslingual Retrieval Augmented In-context Learning for Bangla (Li et al., BanglaLP 2023)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/2023.banglalp-1.15.pdf
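The retrieval-augmented prompting described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the paper presumably uses a multilingual sentence encoder to place Bangla queries and high-resource-language examples in a shared embedding space, whereas here a toy bag-of-words embedding, an English query, and a hypothetical labeled pool stand in so the sketch is self-contained. The core idea (rank labeled examples by semantic similarity to the query, stack the top-k as in-context demonstrations, and let the generative MPLM complete the prompt) is the same.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a stand-in for the multilingual
    # sentence encoder assumed to be used for cross-lingual retrieval.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_prompt(query, pool, k=2):
    # Rank labeled high-resource examples by similarity to the query,
    # then prepend the top-k as in-context demonstrations.
    q = embed(query)
    ranked = sorted(pool, key=lambda ex: cosine(q, embed(ex["text"])), reverse=True)
    demos = "\n".join(f'Text: {ex["text"]}\nLabel: {ex["label"]}' for ex in ranked[:k])
    # The resulting prompt would be fed to a generative MPLM such as BLOOMZ.
    return f"{demos}\nText: {query}\nLabel:"

# Hypothetical English (high-resource) pool for a sentiment task.
pool = [
    {"text": "the movie was wonderful", "label": "positive"},
    {"text": "a dull and boring film", "label": "negative"},
    {"text": "stock prices rose sharply", "label": "positive"},
]

prompt = retrieve_prompt("the film was wonderful and fun", pool, k=2)
print(prompt)
```

In the cross-lingual setting of the paper, the query would be a Bangla input and only the retriever needs to bridge languages; the demonstrations themselves remain in the high-resource language.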