Abid Hossain

2026

Scientific peer reviews frequently contain conflicting expert judgments, and the increasing scale of conference submissions makes it challenging for Area Chairs and editors to reliably identify and interpret such disagreements. Existing approaches typically frame reviewer disagreement as binary contradiction detection over isolated sentence pairs, abstracting away the review-level context and obscuring differences in the severity of evaluative conflict. In this work, we introduce a fine-grained formulation of reviewer contradiction analysis that operates over full peer reviews by explicitly identifying contradiction evidence spans and assigning graded disagreement intensity scores. To support this task, we present RevCI, an expert-annotated benchmark of peer-review pairs with evidence-level contradiction annotations and graded intensity labels. We further propose IMPACT, a structured multi-agent framework that integrates aspect-conditioned evidence extraction, deliberative reasoning, and adjudication to model reviewer contradictions and their intensity. To support efficient deployment, we distill IMPACT into TIDE, a small language model that predicts contradiction evidence and intensity in a single forward pass. Experimental results show that IMPACT substantially outperforms strong single-agent and generic multi-agent baselines in both evidence identification and intensity agreement, while TIDE achieves competitive performance at significantly lower inference cost.

Failure at BEA 2026 Shared Task 1: One Pipeline, Three L1s: A Unified Language-Agnostic System for Vocabulary Difficulty Prediction
Abid Hossain | Kamruzzaman Khan Alve
Proceedings of the 21st Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2026)

We present a unified, language-agnostic system for the BEA 2026 Shared Task on vocabulary difficulty prediction. The system uses a single training pipeline across Spanish, German, and Mandarin Chinese without any language-specific adaptation. Input features include serialized text fields and four scalar length-based features, processed using an XLM-RoBERTa encoder with attention-mask-weighted mean pooling. Hyperparameters are tuned with Optuna under reduced cross-validation, followed by full 5-fold training and checkpoint-based ensembling. Our approach improves over the official closed-track baseline across all three L1 conditions, demonstrating that a shared architecture and training strategy can yield consistent gains without language-specific engineering. Error analysis shows higher prediction error at difficulty extremes, suggesting a regression-to-the-mean tendency.
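The attention-mask-weighted mean pooling named in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `masked_mean_pool`, the plain-list representation of token vectors, and the small epsilon guard are all assumptions made for illustration. The idea is that padding positions (mask value 0) contribute nothing to the average, so the pooled vector reflects only real tokens.

```python
from typing import List, Sequence


def masked_mean_pool(
    token_vectors: Sequence[Sequence[float]],
    attention_mask: Sequence[int],
) -> List[float]:
    """Average token vectors, weighting each position by its mask value.

    Hypothetical helper for illustration; positions with mask 0 (padding)
    are excluded from both the sum and the divisor.
    """
    if len(token_vectors) != len(attention_mask):
        raise ValueError("token_vectors and attention_mask must have the same length")
    if not token_vectors:
        raise ValueError("at least one token is required")

    dim = len(token_vectors[0])
    totals = [0.0] * dim
    weight = 0.0
    for vec, m in zip(token_vectors, attention_mask):
        for i in range(dim):
            totals[i] += m * vec[i]
        weight += m

    # Assumed guard: avoid division by zero when every position is padding.
    weight = max(weight, 1e-9)
    return [t / weight for t in totals]


# Two real tokens followed by one padding position.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = masked_mean_pool(tokens, mask)
```

In this example the padding vector is ignored, so `pooled` is the mean of the first two token vectors only. In a real encoder pipeline the same computation would typically be done on batched tensors rather than Python lists.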

2025

MENDER: Multi-hop Commonsense and Domain-specific CoT Reasoning for Knowledge-grounded Empathetic Counseling of Crime Victims
Abid Hossain | Priyanshu Priya | Armita Mani Tripathi | Pradeepika Verma | Asif Ekbal
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop)

Commonsense inference and domain-specific expertise are crucial for understanding and responding to emotional, cognitive, and topic-specific cues in counseling conversations with crime victims. However, these key pieces of evidence are often dispersed across multiple utterances, making them difficult to capture through single-hop reasoning. To address this, we propose MENDER, a novel Multi-hop commonsensE and domaiN-specific Chain-of-Thought (CoT) reasoning framework for knowleDge-grounded empathEtic Response generation in counseling dialogues. MENDER leverages large language models (LLMs) to integrate commonsense and domain knowledge via multi-hop reasoning over the dialogue context. It employs two specialized reasoning chains, viz. Commonsense Knowledge-driven CoT and Domain Knowledge-driven CoT rationales, which extract and aggregate dispersed emotional, cognitive, and topical evidence to generate knowledge-grounded empathetic counseling responses. Experimental evaluations on the counseling dialogue dataset POEM validate MENDER’s efficacy in generating coherent, empathetic, knowledge-grounded responses.

Co-authors

Priyanshu Priya 1

Tanik Saikh 1

Armita Mani Tripathi 1

Pradeepika Verma 1
