Everything you need to know about Multilingual LLMs: Towards fair, performant and reliable models for languages of the world
Sunayana Sitaram, Monojit Choudhury, Barun Patra, Vishrav Chaudhary, Kabir Ahuja, Kalika Bali
Abstract
This tutorial will describe various aspects of scaling up language technologies to many of the world’s languages by describing the latest research in Massively Multilingual Language Models (MMLMs). We will cover topics such as data collection, training and fine-tuning of models, Responsible AI issues such as fairness, bias and toxicity, linguistic diversity and evaluation in the context of MMLMs, specifically focusing on issues in non-English and low-resource languages. Further, we will also talk about some of the real-world challenges in deploying these models in language communities in the field. With the performance of MMLMs improving in the zero-shot setting for many languages, it is now becoming feasible to use them for building language technologies in many languages of the world, and this tutorial will provide the computational linguistics community with unique insights from the latest research in multilingual models.
- Anthology ID:
- 2023.acl-tutorials.3
- Volume:
- Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 6: Tutorial Abstracts)
- Month:
- July
- Year:
- 2023
- Address:
- Toronto, Canada
- Editors:
- Yun-Nung (Vivian) Chen, Margot Mieskes, Siva Reddy
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 21–26
- URL:
- https://aclanthology.org/2023.acl-tutorials.3
- DOI:
- 10.18653/v1/2023.acl-tutorials.3
- Cite (ACL):
- Sunayana Sitaram, Monojit Choudhury, Barun Patra, Vishrav Chaudhary, Kabir Ahuja, and Kalika Bali. 2023. Everything you need to know about Multilingual LLMs: Towards fair, performant and reliable models for languages of the world. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 6: Tutorial Abstracts), pages 21–26, Toronto, Canada. Association for Computational Linguistics.
- Cite (Informal):
- Everything you need to know about Multilingual LLMs: Towards fair, performant and reliable models for languages of the world (Sitaram et al., ACL 2023)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2023.acl-tutorials.3.pdf