Asefa Mebrahtu Abera

2026

GCCLA: Graph-Conditioned Cross-Lingual Adaptation of Large Language Models Under Extreme Data Scarcity (A Case Study in Tigrigna)
Hagos Gebremedhin Gebremeskel | Chong Feng | Asefa Mebrahtu Abera
Proceedings of the 4th Workshop on Cross-Cultural Considerations in NLP (C3NLP 2026)

Adapting large language models (LLMs) to extremely low-resource languages remains challenging due to severe data scarcity and the lack of structured linguistic supervision. We introduce GCCLA, a graph-conditioned cross-lingual adaptation framework that integrates multilingual knowledge graphs into parameter-efficient LLM adaptation. GCCLA conditions a frozen multilingual LLM on structured semantic and typological relations encoded in a multilingual graph, providing a strong inductive bias for data-efficient transfer. We instantiate and evaluate the framework through a focused case study on English-to-Amharic-to-Tigrinya transfer, where labeled data is extremely limited. By separating knowledge representation from language modeling, GCCLA stabilizes learning and improves sample efficiency in few-shot regimes. We evaluate the approach on five tasks (sentiment analysis, named entity recognition, natural language inference, question answering, and extractive summarization) under extreme data scarcity, with between 0 and 1,000 labeled Tigrinya examples. Experimental results show that GCCLA consistently outperforms multilingual, translation-based, and parameter-efficient baselines, achieves competitive performance with as few as 100 labeled examples, and degrades gracefully under partial graph coverage. These findings demonstrate that graph conditioning is an effective principle for data-efficient cross-lingual adaptation of LLMs, advancing equitable NLP.

