Thesis Proposal: Efficient Methods for Natural Language Generation/Understanding Systems

Nalin Kumar


Abstract
While Large Language Models (LLMs) have shown remarkable performance in various Natural Language Processing (NLP) tasks, their effectiveness seems to be heavily biased toward high-resource languages. This proposal aims to address this gap by developing efficient training strategies for low-resource languages. We propose various techniques for efficient learning in simulated low-resource settings for English, and then plan to adapt these methods to low-resource languages. We plan to experiment with both natural language generation and understanding models. For English, we evaluate the models on benchmarks similar to those of the BabyLM challenge. For other languages, we plan to use treebanks and translation techniques to create our own silver test sets for evaluating the low-resource LMs.
Anthology ID:
2025.ijcnlp-srw.18
Volume:
The 14th International Joint Conference on Natural Language Processing and The 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Month:
December
Year:
2025
Address:
Mumbai, India
Editors:
Santosh T.y.s.s, Shuichiro Shimizu, Yifan Gong
Venue:
IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
209–217
URL:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.ijcnlp-srw.18/
Cite (ACL):
Nalin Kumar. 2025. Thesis Proposal: Efficient Methods for Natural Language Generation/Understanding Systems. In The 14th International Joint Conference on Natural Language Processing and The 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, pages 209–217, Mumbai, India. Association for Computational Linguistics.
Cite (Informal):
Thesis Proposal: Efficient Methods for Natural Language Generation/Understanding Systems (Kumar, IJCNLP 2025)
PDF:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.ijcnlp-srw.18.pdf