Team Horizon at BHASHA Task 2: Fine-tuning Multilingual Transformers for Indic Word Grouping

Manav Dhamecha, Gaurav Damor, Sunil Jaat, Pruthwik Mishra


Abstract
We present Team Horizon’s approach to BHASHA Task 2: Indic Word Grouping. We model word grouping as a token classification task and fine-tune multilingual Transformer encoders for it. We evaluate MuRIL, XLM-RoBERTa, and IndicBERT v2 and report exact-match accuracy on the test data. Our best model (MuRIL) achieves 58.1818% exact match on the test set.
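As a rough illustration of the setup described in the abstract, word grouping can be cast as per-word B/I tagging, with exact match requiring the full predicted tag sequence to equal the gold one. The label scheme and metric below are illustrative assumptions, not the authors' actual implementation:

```python
# Hypothetical sketch: Indic word grouping as token classification.
# Each word gets a BIO-style tag: "B" starts a new group, "I" continues one.
# (Label scheme and metric are assumptions for illustration, not the paper's code.)

def groups_to_tags(groups):
    """Convert word groups, e.g. [["nai", "dilli"], ["mein"]] -> ["B", "I", "B"]."""
    tags = []
    for group in groups:
        tags.append("B")                 # first word opens a new group
        tags.extend("I" for _ in group[1:])  # remaining words continue it
    return tags

def exact_match(predictions, references):
    """Fraction of sentences whose entire tag sequence matches the gold one."""
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

pred = [["B", "I", "B"], ["B", "B"]]
gold = [["B", "I", "B"], ["B", "I"]]
print(exact_match(pred, gold))  # 0.5
```

In practice, the per-word tags would be predicted by a fine-tuned encoder (e.g. MuRIL with a token-classification head), with subword predictions aligned back to words before scoring.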
Anthology ID:
2025.bhasha-1.18
Volume:
Proceedings of the 1st Workshop on Benchmarks, Harmonization, Annotation, and Standardization for Human-Centric AI in Indian Languages (BHASHA 2025)
Month:
December
Year:
2025
Address:
Mumbai, India
Editors:
Arnab Bhattacharya, Pawan Goyal, Saptarshi Ghosh, Kripabandhu Ghosh
Venues:
BHASHA | WS
Publisher:
Association for Computational Linguistics
Pages:
175–179
URL:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.bhasha-1.18/
Cite (ACL):
Manav Dhamecha, Gaurav Damor, Sunil Jaat, and Pruthwik Mishra. 2025. Team Horizon at BHASHA Task 2: Fine-tuning Multilingual Transformers for Indic Word Grouping. In Proceedings of the 1st Workshop on Benchmarks, Harmonization, Annotation, and Standardization for Human-Centric AI in Indian Languages (BHASHA 2025), pages 175–179, Mumbai, India. Association for Computational Linguistics.
Cite (Informal):
Team Horizon at BHASHA Task 2: Fine-tuning Multilingual Transformers for Indic Word Grouping (Dhamecha et al., BHASHA 2025)
PDF:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.bhasha-1.18.pdf