Building a Functional Machine Translation Corpus for Kpelle

Kweku Andoh Yamoah; Jackson Weako; Emmanuel Dorley

Abstract

In this paper, we introduce the first publicly available English-Kpelle dataset for machine translation, comprising over 2,000 sentence pairs drawn from everyday communication, religious texts, and educational materials. By fine-tuning Meta's No Language Left Behind (NLLB) model on two versions of the dataset, we achieved BLEU scores of up to 30 in the Kpelle-to-English direction, demonstrating the benefits of data augmentation. Our findings align with NLLB-200 benchmarks on other African languages, underscoring Kpelle's potential for competitive performance despite its low-resource status. Beyond machine translation, this dataset enables broader NLP tasks, including speech recognition and language modeling. We conclude with a roadmap for future dataset expansion, emphasizing orthographic consistency, community-driven validation, and interdisciplinary collaboration to advance inclusive language technology development for Kpelle and other low-resourced Mande languages.

Anthology ID: 2025.africanlp-1.8
Volume: Proceedings of the Sixth Workshop on African Natural Language Processing (AfricaNLP 2025)
Month: July
Year: 2025
Address: Vienna, Austria
Editors: Constantine Lignos, Idris Abdulmumin, David Adelani
Venues: AfricaNLP | WS
Publisher: Association for Computational Linguistics
Pages: 52–63
URL: https://preview.aclanthology.org/display_plenaries/2025.africanlp-1.8/
Cite (ACL): Kweku Andoh Yamoah, Jackson Weako, and Emmanuel Dorley. 2025. Building a Functional Machine Translation Corpus for Kpelle. In Proceedings of the Sixth Workshop on African Natural Language Processing (AfricaNLP 2025), pages 52–63, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal): Building a Functional Machine Translation Corpus for Kpelle (Yamoah et al., AfricaNLP 2025)
PDF: https://preview.aclanthology.org/display_plenaries/2025.africanlp-1.8.pdf
