Everything you need to know about Multilingual LLMs: Towards fair, performant and reliable models for languages of the world

Sunayana Sitaram; Monojit Choudhury; Barun Patra; Vishrav Chaudhary; Kabir Ahuja; Kalika Bali

doi:10.18653/v1/2023.acl-tutorials.3

Abstract

This tutorial will cover the main aspects of scaling up language technologies to many of the world’s languages, drawing on the latest research in Massively Multilingual Language Models (MMLMs). Topics include data collection; training and fine-tuning of models; Responsible AI issues such as fairness, bias and toxicity; and linguistic diversity and evaluation in the context of MMLMs, with a specific focus on non-English and low-resource languages. We will also discuss some of the real-world challenges of deploying these models in language communities in the field. As the zero-shot performance of MMLMs improves for many languages, it is becoming feasible to use them to build language technologies for many of the world’s languages, and this tutorial will give the computational linguistics community insights from the latest research in multilingual models.

Anthology ID: 2023.acl-tutorials.3
Volume: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 6: Tutorial Abstracts)
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Yun-Nung (Vivian) Chen, Margot Mieskes, Siva Reddy
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 21–26
URL: https://aclanthology.org/2023.acl-tutorials.3
DOI: 10.18653/v1/2023.acl-tutorials.3
Cite (ACL): Sunayana Sitaram, Monojit Choudhury, Barun Patra, Vishrav Chaudhary, Kabir Ahuja, and Kalika Bali. 2023. Everything you need to know about Multilingual LLMs: Towards fair, performant and reliable models for languages of the world. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 6: Tutorial Abstracts), pages 21–26, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Everything you need to know about Multilingual LLMs: Towards fair, performant and reliable models for languages of the world (Sitaram et al., ACL 2023)
PDF: https://preview.aclanthology.org/nschneid-patch-1/2023.acl-tutorials.3.pdf
