Steven Limcorn




2025

NusaBERT: Teaching IndoBERT to be Multilingual and Multicultural
Wilson Wongso | David Samuel Setiawan | Steven Limcorn | Ananto Joyoadikusumo
Proceedings of the Second Workshop in South East Asian Language Processing

We present NusaBERT, a multilingual model built on IndoBERT and tailored for Indonesia’s diverse languages. By expanding vocabulary and pre-training on a regional corpus, NusaBERT achieves state-of-the-art performance on Indonesian NLU benchmarks, enhancing IndoBERT’s multilingual capability. This study also addresses NusaBERT’s limitations and encourages further research on Indonesia’s underrepresented languages.

Lazarus NLP at SemEval-2025 Task 11: Fine-Tuning Large Language Models for Multi-Label Emotion Classification via Sentence-Label Pairing
Wilson Wongso | David Setiawan | Ananto Joyoadikusumo | Steven Limcorn
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

Multi-label emotion classification in low-resource languages remains challenging due to limited annotated data and model adaptability. To address this, we fine-tune large language models (LLMs) using a sentence-label pairing approach, optimizing efficiency while improving classification performance. Evaluated on Sundanese, Indonesian, and Javanese, our method outperforms conventional classifier-based fine-tuning and achieves strong zero-shot cross-lingual transfer. Notably, our approach ranks first in the Sundanese subset of SemEval-2025 Task 11 Track A. Our findings demonstrate the effectiveness of LLM fine-tuning for low-resource emotion classification, underscoring the importance of tailoring adaptation strategies to specific language families in multilingual contexts.