Ahmed Kirmani


Fixing paper assignments

  1. Please select all papers that belong to the same person.
  2. Indicate below which author they should be assigned to.
Provide a valid ORCID iD here. This will be used to match future papers to this author.
Provide the name of the school or the university where the author has received or will receive their highest degree (e.g., Ph.D. institution for researchers, or current affiliation for students). This will be used to form the new author page ID, if needed.

TODO: "submit" and "cancel" buttons here


2025

pdf bib
Sample Efficient Alignment Learning With Episodic Control
Van Dai Do | Quan Hung Tran | Ahmed Kirmani | Lu Zhang | Hung Le
Findings of the Association for Computational Linguistics: EMNLP 2025

Aligning large language models (LLMs) with specific task objectives is challenging, especially when access to feedback signals for guiding the model is limited. While existing parametric methods perform reasonably, they rely heavily on large datasets and frequent feedback, making them impractical in scenarios with limited human feedback. We introduce Alignment Learning with Episodic Control (ALEC), a non-parametric framework that aligns LLM outputs during inference without fine-tuning. ALEC employs a key-value memory to store the associations between generated text and its corresponding values. It leverages a novel confidence-based writing scheme to update these stored values, maximizing the use of available data. During inference, ALEC utilizes a nearest-neighbor mechanism to estimate the values of generated texts, enabling the selection of the optimal text for decoding. Our method outperforms state-of-the-art baselines on harmless, helpful, and summarization tasks, demonstrating improved alignment with minimal interactions with the true reward model.