Hany Fazzaa


Fixing paper assignments

  1. Please select all papers that belong to the same person.
  2. Indicate below which author they should be assigned to.
Provide a valid ORCID iD here. This will be used to match future papers to this author.
Provide the name of the school or the university where the author has received or will receive their highest degree (e.g., Ph.D. institution for researchers, or current affiliation for students). This will be used to form the new author page ID, if needed.

TODO: "submit" and "cancel" buttons here


2025

pdf bib
Capturing Intra-Dialectal Variation in Qatari Arabic: A Corpus of Cultural and Gender Dimensions
Houda Bouamor | Sara Al-Emadi | Zeinab Ibrahim | Hany Fazzaa | Aisha Al-Sultan
Proceedings of The Third Arabic Natural Language Processing Conference

We present the first publicly available, multidimensional corpus of Qatari Arabic that captures intra-dialectal variation across Urban and Bedouin speakers. While often grouped under the label of “Gulf Arabic”, Qatari Arabic exhibits rich phonological, lexical, and discourse-level differences shaped by gender, age, and sociocultural identity. Our dataset includes aligned speech and transcriptions from 255 speakers, stratified by gender and age, and collected through structured interviews on culturally salient topics such as education, heritage, and social norms. The corpus reveals systematic variation in pronunciation, vocabulary, and narrative style, offering insights for both sociolinguistic analysis and computational modeling. We also demonstrate its utility through preliminary experiments in the prediction of dialects and genders. This work provides the first large-scale, demographically balanced corpus of Qatari Arabic, laying a foundation for both sociolinguistic research and the development of dialect-aware NLP systems.