Non-Contextual BERT or FastText? A Comparative Analysis
Abhay Shanbhag, Suramya Jadhav, Amogh Thakurdesai, Ridhima Bhaskar Sinare, Raviraj Joshi
Abstract
Natural Language Processing (NLP) for low-resource languages, which lack large annotated datasets, faces significant challenges due to limited high-quality data and linguistic resources. The choice of embeddings plays a critical role in achieving strong performance on NLP tasks. While contextual BERT embeddings require a full forward pass through the model, non-contextual BERT embeddings require only a table lookup. Existing research has primarily focused on contextual BERT embeddings, leaving non-contextual embeddings largely unexplored. In this study, we analyze the effectiveness of non-contextual embeddings from BERT models (MuRIL and MahaBERT) and FastText models (IndicFT and MahaFT) for tasks such as news classification, sentiment analysis, and hate speech detection in one such low-resource language, Marathi. We compare these embeddings with their contextual and compressed variants. Our findings indicate that non-contextual BERT embeddings extracted from the model's first embedding layer outperform FastText embeddings, presenting a promising alternative for low-resource NLP.
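The distinction the abstract draws can be made concrete with a minimal sketch using the Hugging Face transformers library. The checkpoint name below is the public MuRIL release and stands in for the paper's exact extraction pipeline, which may differ; it contrasts the table-lookup non-contextual embeddings from the first embedding layer with the contextual embeddings from a full forward pass.

```python
# Minimal sketch: non-contextual (table lookup) vs. contextual
# (full forward pass) BERT embeddings. Assumes the public MuRIL
# checkpoint google/muril-base-cased; the paper's setup may differ.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL = "google/muril-base-cased"  # assumed public MuRIL checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL)
model.eval()

sentence = "..."  # replace with a Marathi input sentence
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    # Non-contextual: index into the first (word) embedding layer --
    # a pure table lookup, no transformer layers are run.
    non_contextual = model.get_input_embeddings()(inputs["input_ids"])

    # Contextual: run the full forward pass through all layers.
    contextual = model(**inputs).last_hidden_state

# Both are (1, seq_len, hidden); mean-pool the tokens to obtain a
# single sentence vector for a downstream classifier.
sentence_vector = non_contextual.mean(dim=1)
```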
- Anthology ID: 2025.globalnlp-1.4
- Volume: Proceedings of the Workshop on Beyond English: Natural Language Processing for all Languages in an Era of Large Language Models
- Month: September
- Year: 2025
- Address: Varna, Bulgaria
- Editors: Sudhansu Bala Das, Pruthwik Mishra, Alok Singh, Shamsuddeen Hassan Muhammad, Asif Ekbal, Uday Kumar Das
- Venues: GlobalNLP | WS
- Publisher: INCOMA Ltd., Shoumen, BULGARIA
- Pages: 27–33
- URL: https://preview.aclanthology.org/corrections-2026-01/2025.globalnlp-1.4/
- Cite (ACL): Abhay Shanbhag, Suramya Jadhav, Amogh Thakurdesai, Ridhima Bhaskar Sinare, and Raviraj Joshi. 2025. Non-Contextual BERT or FastText? A Comparative Analysis. In Proceedings of the Workshop on Beyond English: Natural Language Processing for all Languages in an Era of Large Language Models, pages 27–33, Varna, Bulgaria. INCOMA Ltd., Shoumen, BULGARIA.
- Cite (Informal): Non-Contextual BERT or FastText? A Comparative Analysis (Shanbhag et al., GlobalNLP 2025)
- PDF: https://preview.aclanthology.org/corrections-2026-01/2025.globalnlp-1.4.pdf