A Survey of NLP Progress in Sino-Tibetan Low-Resource Languages

Shuheng Liu, Michael Best


Abstract
Despite growing efforts to include more low-resource languages in NLP/CL development, most of the world’s languages are still absent. In this paper, we take the example of the Sino-Tibetan language family, which consists of hundreds of low-resource languages, and we examine how these languages are represented in papers archived in the ACL Anthology. Our findings indicate that while more techniques and discussions covering more languages have appeared in more publication venues over the years, the overall focus on this language family has been minimal. The lack of attention may be owing to the small number of native speakers of these languages and the limited governmental support for them. The current development of large language models, albeit successful in a few quintessential rich-resource languages, is still trailing when tackling these low-resource languages. Our paper calls for attention in NLP/CL research to the inclusion of low-resource languages, especially as increasing resources are poured into the development of data-driven language models.
Anthology ID:
2025.naacl-long.396
Volume:
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Month:
April
Year:
2025
Address:
Albuquerque, New Mexico
Editors:
Luis Chiruzzo, Alan Ritter, Lu Wang
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
7804–7825
URL:
https://preview.aclanthology.org/landing_page/2025.naacl-long.396/
Cite (ACL):
Shuheng Liu and Michael Best. 2025. A Survey of NLP Progress in Sino-Tibetan Low-Resource Languages. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 7804–7825, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal):
A Survey of NLP Progress in Sino-Tibetan Low-Resource Languages (Liu & Best, NAACL 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.naacl-long.396.pdf