Minh Phuc Nguyen




2025

Fostering Digital Inclusion for Low-Resource Nigerian Languages: A Case Study of Igbo and Nigerian Pidgin
Ebelechukwu Nwafor | Minh Phuc Nguyen
Proceedings of the Eighth Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2025)

Current state-of-the-art large language models (LLMs) like GPT-4 perform exceptionally well in translation tasks for high-resource languages such as English, but are often much less accurate for low-resource African languages such as Igbo and Nigerian Pidgin, two languages native to Nigeria. This study addresses the need for linguistic diversity in Artificial Intelligence (AI) by creating benchmark datasets for Igbo-English and Nigerian Pidgin-English translation tasks. The datasets are curated from reputable online sources and meticulously annotated by crowd-sourced native-speaking human annotators. Using the datasets, we evaluate the translation abilities of GPT-based models alongside other state-of-the-art translation models specifically designed for low-resource languages. Our results demonstrate that these state-of-the-art translation models outperform GPT-based models in these translation tasks. In addition, the datasets can significantly enhance LLM performance in these tasks, marking a step toward reducing linguistic bias and promoting more inclusive AI models.