Amin Hassanpour



2025

PARME: Parallel Corpora for Low-Resourced Middle Eastern Languages
Sina Ahmadi | Rico Sennrich | Erfan Karami | Ako Marani | Parviz Fekrazad | Gholamreza Akbarzadeh Baghban | Hanah Hadi | Semko Heidari | Mahîr Dogan | Pedram Asadi | Dashne Bashir | Mohammad Amin Ghodrati | Kourosh Amini | Zeynab Ashourinezhad | Mana Baladi | Farshid Ezzati | Alireza Ghasemifar | Daryoush Hosseinpour | Behrooz Abbaszadeh | Amin Hassanpour | Bahaddin Jalal Hamaamin | Saya Kamal Hama | Ardeshir Mousavi | Sarko Nazir Hussein | Isar Nejadgholi | Mehmet Ölmez | Horam Osmanpour | Rashid Roshan Ramezani | Aryan Sediq Aziz | Ali Salehi Sheikhalikelayeh | Mohammadreza Yadegari | Kewyar Yadegari | Sedighe Zamani Roodsari
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The Middle East is characterized by remarkable linguistic diversity, with over 400 million inhabitants speaking more than 60 languages across multiple language families. This study presents PARME, the first parallel corpora for eight severely under-resourced varieties in the region, and addresses fundamental challenges in low-resource scenarios, including non-standardized writing and dialectal complexity. Through an extensive community-driven initiative, volunteers contributed over 36,000 translated sentences, marking a significant milestone in resource development. We evaluate machine translation capabilities through zero-shot approaches and fine-tuning experiments with pretrained machine translation models and provide a comprehensive analysis of limitations. Our findings reveal significant gaps in existing technologies for processing the selected languages, highlighting critical areas for improvement in language technology for Middle Eastern languages.