Nabiha Asghar


2025

pdf bib
Concept Distillation from Strong to Weak Models via Hypotheses-to-Theories Prompting
Emmanuel Aboah Boateng | Cassiano O Becker | Nabiha Asghar | Kabir Walia | Ashwin Srinivasan | Ehi Nosakhare | Soundararajan Srinivasan | Victor Dibia
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track)

Hand-crafting high quality prompts to optimize the performance of language models is a complicated and labor-intensive process. Furthermore, when migrating to newer, smaller, or weaker models (possibly due to latency or cost gains), prompts need to be updated to re-optimize the task performance. We propose Concept Distillation (CD), an automatic prompt optimization technique for enhancing weaker models on complex tasks. CD involves: (1) collecting mistakes made by weak models with a base prompt (initialization), (2) using a strong model to generate reasons for these mistakes and create rules/concepts for weak models (induction), and (3) filtering these rules based on validation set performance and integrating them into the base prompt (deduction/verification). We evaluated CD on NL2Code and mathematical reasoning tasks, observing significant performance boosts for small and weaker language models. Notably, Mistral-7B’s accuracy on Multi-Arith increased by 20%, and Phi-3-mini-3.8B’s accuracy on HumanEval rose by 34%. Compared to other automated methods, CD offers an effective, cost-efficient strategy for improving weak models’ performance on complex tasks and enables seamless workload migration across different language models without compromising performance.

2017

pdf bib
Deep Active Learning for Dialogue Generation
Nabiha Asghar | Pascal Poupart | Xin Jiang | Hang Li
Proceedings of the 6th Joint Conference on Lexical and Computational Semantics (*SEM 2017)

We propose an online, end-to-end, neural generative conversational model for open-domain dialogue. It is trained using a unique combination of offline two-phase supervised learning and online human-in-the-loop active learning. While most existing research proposes offline supervision or hand-crafted reward functions for online reinforcement, we devise a novel interactive learning mechanism based on hamming-diverse beam search for response generation and one-character user-feedback at each step. Experiments show that our model inherently promotes the generation of semantically relevant and interesting responses, and can be used to train agents with customized personas, moods and conversational styles.