Miftahul Mahfuzh


Fixing paper assignments

  1. Please select all papers that belong to the same person.
  2. Indicate below which author they should be assigned to.
Provide a valid ORCID iD here. This will be used to match future papers to this author.
Provide the name of the school or the university where the author has received or will receive their highest degree (e.g., Ph.D. institution for researchers, or current affiliation for students). This will be used to form the new author page ID, if needed.

TODO: "submit" and "cancel" buttons here


2025

pdf bib
NusaDialogue: Dialogue Summarization and Generation for Underrepresented and Extremely Low-Resource Languages
Ayu Purwarianti | Dea Adhista | Agung Baptiso | Miftahul Mahfuzh | Yusrina Sabila | Aulia Adila | Samuel Cahyawijaya | Alham Fikri Aji
Proceedings of the Second Workshop in South East Asian Language Processing

Developing dialogue summarization for extremely low-resource languages is a challenging task. We introduce NusaDialogue, a dialogue summarization dataset for three underrepresented languages in the Malayo-Polynesian language family: Minangkabau, Balinese, and Buginese. NusaDialogue covers 17 topics and 185 subtopics, with annotations provided by 73 native speakers. Additionally, we conducted experiments using fine-tuning on a specifically designed medium-sized language model for Indonesian, as well as zero- and few-shot learning on various multilingual large language models (LLMs). The results indicate that, for extremely low-resource languages such as Minangkabau, Balinese, and Buginese, the fine-tuning approach yields significantly higher performance compared to zero- and few-shot prompting, even when applied to LLMs with considerably larger parameter sizes.