Samreen Kazi


2026

Urdu, a morphologically rich and low-resource language spoken by over 300 million people, poses unique challenges for extractive machine reading comprehension (EMRC), particularly in accurately identifying span boundaries involving postpositions and copulas. Existing multilingual models struggle with subword fragmentation and imprecise span extraction in such settings. We introduce QARI (قاری, “reader”), a character-enhanced architecture for Urdu extractive MRC that augments pretrained multilingual encoders with three innovations: (1) a character-level CNN that captures affix patterns and morphological features from full word forms; (2) a gated fusion mechanism that integrates semantic and morphological representations; and (3) a boundary-contrastive learning objective targeting Urdu-specific span errors. Evaluated on UQuAD+, the first native Urdu MRC benchmark, QARI achieves 83.5 F1, a 5.5 point improvement over the previous best result (mT5, 78.0 F1), setting a new state of the art. Ablations show that character-level modeling and boundary supervision contribute +7.5 and +7.0 F1, respectively. Cross-dataset evaluations on UQA and UrFQuAD confirm QARI’s robustness. Error analysis reveals significant reductions in boundary drift, with improvements most notable for short factual questions.
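The gated fusion mechanism mentioned above can be sketched as follows. This is a minimal illustrative form only (all names, the weight shape, and the gating equation g = sigmoid(W[h_sem; h_char] + b) are my assumptions, not the paper's implementation): a learned gate decides, per dimension, how much of the encoder's semantic representation versus the character-CNN's morphological representation to keep.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(h_sem, h_char, W, b):
    """Blend semantic and character-level token vectors.

    g = sigmoid([h_sem; h_char] @ W + b) gives a per-dimension
    mixing weight; the output is a convex combination of the
    two input representations (illustrative formulation).
    """
    g = sigmoid(np.concatenate([h_sem, h_char]) @ W + b)
    return g * h_sem + (1.0 - g) * h_char

d = 8                         # hidden size (illustrative)
h_sem = rng.normal(size=d)    # e.g., from a multilingual encoder
h_char = rng.normal(size=d)   # e.g., from a character-level CNN
W = rng.normal(size=(2 * d, d))
b = np.zeros(d)

fused = gated_fusion(h_sem, h_char, W, b)
```

Because the gate lies in (0, 1), each output dimension stays between the corresponding semantic and character values, so neither representation can be fully discarded.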

2025

This study evaluates the question-answering capabilities of Large Language Models (LLMs) in Urdu, addressing a critical gap in low-resource language processing. Four models (GPT-4, mBERT, XLM-R, and mT5) are assessed across monolingual, cross-lingual, and mixed-language settings using the UQuAD1.0 and SQuAD2.0 datasets. Results reveal significant performance gaps between English and Urdu processing, with GPT-4 achieving the highest F1 scores (89.1% in English, 76.4% in Urdu) while demonstrating relative robustness in cross-lingual scenarios. Boundary detection and translation mismatches emerge as the primary challenges, particularly in cross-lingual settings. The study further demonstrates that question complexity and length significantly affect performance, with factoid questions yielding 14.2% higher F1 scores than complex questions. These findings establish important benchmarks for enhancing LLM performance in low-resource languages and identify key areas for improvement in multilingual question-answering systems.
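The F1 scores reported in these abstracts follow the standard SQuAD-style token-overlap metric, which rewards partial span matches rather than only exact ones. A minimal sketch (the function name is mine; whitespace tokenization is a simplifying assumption):

```python
from collections import Counter

def span_f1(prediction, gold):
    """Token-overlap F1 for SQuAD-style extractive QA scoring.

    Counts multiset-overlapping tokens between the predicted and
    gold answer spans, then combines precision and recall.
    """
    pred_tokens = prediction.split()
    gold_tokens = gold.split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, predicting "the red fort" against the gold span "red fort" gives precision 2/3 and recall 1, hence F1 = 0.8; this partial credit is why boundary drift (a few extra or missing tokens, such as a trailing postposition) lowers F1 without zeroing it.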

2024