Towards Low-resource Language Generation with Limited Supervision

Kaushal Maurya, Maunendra Desarkar


Abstract
We present a research narrative aimed at enabling language technology for multiple natural language generation (NLG) tasks in low-resource languages (LRLs). Of the approximately 7,000 languages spoken globally, many lack the resources required for model training. NLG applications for LRLs present two additional key challenges: (i) the difficulty of training is more pronounced, and (ii) although zero-shot modeling is a viable research direction for scalability, generating well-formed text in target LRLs in a zero-shot manner is challenging. To address these concerns, this narrative introduces three promising research explorations that serve as a step toward enabling language technology for many LRLs. These approaches make effective use of transfer learning and limited-supervision techniques for modeling. Evaluations were conducted mostly in the zero-shot setting, which supports scalability. This research narrative describes an ongoing doctoral thesis.
Anthology ID:
2023.bigpicture-1.7
Volume:
Proceedings of the Big Picture Workshop
Month:
December
Year:
2023
Address:
Singapore
Editors:
Yanai Elazar, Allyson Ettinger, Nora Kassner, Sebastian Ruder, Noah A. Smith
Venue:
BigPicture
Publisher:
Association for Computational Linguistics
Pages:
80–92
URL:
https://aclanthology.org/2023.bigpicture-1.7
DOI:
10.18653/v1/2023.bigpicture-1.7
Cite (ACL):
Kaushal Maurya and Maunendra Desarkar. 2023. Towards Low-resource Language Generation with Limited Supervision. In Proceedings of the Big Picture Workshop, pages 80–92, Singapore. Association for Computational Linguistics.
Cite (Informal):
Towards Low-resource Language Generation with Limited Supervision (Maurya & Desarkar, BigPicture 2023)
PDF:
https://preview.aclanthology.org/ingest-2024-clasp/2023.bigpicture-1.7.pdf