Abstract
We present our systems for the three tasks and five languages included in the MRL 2024 Shared Task on Multilingual Multi-task Information Retrieval: (1) Named Entity Recognition, (2) Free-form Question Answering, and (3) Multiple-choice Question Answering. For each task, we explored the impact of selecting different multilingual language models for fine-tuning across various target languages, and implemented an ensemble system that generates final outputs based on predictions from multiple fine-tuned models. All models are large language models fine-tuned on task-specific data. Our experimental results show that a more balanced dataset would yield better results. However, when training data for certain languages are scarce, fine-tuning on a large amount of English data supplemented by a small amount of “triggering data” in the target language can produce decent results.
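The abstract mentions an ensemble system that produces final outputs from the predictions of several fine-tuned models. As a rough illustration only, the sketch below shows prediction-level ensembling via majority voting; the voting rule, function names, and data layout are assumptions made for this example and are not taken from the paper.

```python
# Minimal sketch of prediction-level ensembling, assuming simple majority
# voting over the outputs of several fine-tuned models. All names here are
# illustrative; the paper's actual ensembling rule may differ.
from collections import Counter
from typing import List


def ensemble_predictions(per_model_outputs: List[List[str]]) -> List[str]:
    """Combine predictions from multiple models by majority vote.

    per_model_outputs[m][i] is model m's prediction for example i
    (e.g. an MCQA choice label or a serialized NER tag sequence).
    Ties are broken by the order in which models are listed.
    """
    num_examples = len(per_model_outputs[0])
    final = []
    for i in range(num_examples):
        votes = Counter(outputs[i] for outputs in per_model_outputs)
        final.append(votes.most_common(1)[0][0])
    return final


# Example: three models answering two multiple-choice questions.
models = [
    ["A", "C"],  # model 1
    ["A", "B"],  # model 2
    ["B", "B"],  # model 3
]
print(ensemble_predictions(models))  # ['A', 'B']
```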
- Anthology ID: 2024.mrl-1.28
- Volume: Proceedings of the Fourth Workshop on Multilingual Representation Learning (MRL 2024)
- Month: November
- Year: 2024
- Address: Miami, Florida, USA
- Editors: Jonne Sälevä, Abraham Owodunni
- Venue: MRL
- Publisher: Association for Computational Linguistics
- Pages: 346–356
- URL: https://aclanthology.org/2024.mrl-1.28
- DOI: 10.18653/v1/2024.mrl-1.28
- Cite (ACL): Senyu Li, Hao Yu, Jessica Ojo, and David Ifeoluwa Adelani. 2024. McGill NLP Group Submission to the MRL 2024 Shared Task: Ensembling Enhances Effectiveness of Multilingual Small LMs. In Proceedings of the Fourth Workshop on Multilingual Representation Learning (MRL 2024), pages 346–356, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal): McGill NLP Group Submission to the MRL 2024 Shared Task: Ensembling Enhances Effectiveness of Multilingual Small LMs (Li et al., MRL 2024)
- PDF: https://preview.aclanthology.org/dois-2013-emnlp/2024.mrl-1.28.pdf