Ahnaf Mozib Samin

Also published as: Ahnaf Mozib Samin


2025

pdf bib
ColorFoil: Investigating Color Blindness in Large Vision and Language Models
Ahnaf Mozib Samin | M Firoz Ahmed | Md. Mushtaq Shahriyar Rafee
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop)

With the utilization of Transformer architecture, large Vision and Language (V&L) models have shown promising performance in even zero-shot settings. Several studies, however, indicate a lack of robustness of the models when dealing with complex linguistics and visual attributes. In this work, we introduce a novel V&L benchmark - ColorFoil, by creating color-related foils to assess the models’ perception ability to detect colors like red, white, green, etc. We evaluate seven state-of-the-art V&L models including CLIP, ViLT, GroupViT, and BridgeTower, etc. in a zero-shot setting and present intriguing findings from the V&L models. The experimental evaluation indicates that ViLT and BridgeTower demonstrate much better color perception capabilities compared to CLIP and its variants and GroupViT. Moreover, CLIP-based models and GroupViT struggle to distinguish colors that are visually distinct to humans with normal color perception ability.

pdf bib
Investigating Adapters for Parameter-efficient Low-resource Automatic Speech Recognition
Ahnaf Mozib Samin | Shekhar Nayak | Andrea De Marco | Claudia Borg
Proceedings of the 10th Workshop on Representation Learning for NLP (RepL4NLP-2025)

Recent years have witnessed the adoption of parameter-efficient adapters in pre-trained language models for natural language processing. Yet, their application in speech processing remains less studied. In this work, we explore the adapters for low-resource speech recognition, introducing a novel technique - ConvAdapt into pre-trained speech models. We investigate various aspects such as data requirements, transfer learning within adapters, and scaling of feed-forward layers in adapters. Our findings reveal that bottleneck adapters offer competitiveness with full fine-tuning with at least 10 hours of data, but they are not as effective in few-shot learning scenarios. Notably, ConvAdapt demonstrates improved performance in such cases. In addition, transfer learning in adapters shows promise, necessitating research in related languages. Furthermore, employing larger speech models for adapter-tuning surpasses fine-tuning with ample data, potentially due to reduced overfitting than fine-tuning.

2024

pdf bib
BD-NLP at SemEval-2024 Task 2: Investigating Generative and Discriminative Models for Clinical Inference with Knowledge Augmentation
Shantanu Nath | Ahnaf Mozib Samin
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)

Healthcare professionals rely on evidence from clinical trial records (CTRs) to devise treatment plans. However, the increasing quantity of CTRs poses challenges in efficiently assimilating the latest evidence to provide personalized evidence-based care. In this paper, we present our solution to the SemEval- 2024 Task 2 titled “Safe Biomedical Natural Language Inference for Clinical Trials”. Given a statement and one/two CTRs as inputs, the task is to determine whether or not the statement entails or contradicts the CTRs. We explore both generative and discriminative large language models (LLM) to investigate their performance for clinical inference. Moreover, we contrast the general-purpose LLMs with the ones specifically tailored for the clinical domain to study the potential advantage in mitigating distributional shifts. Furthermore, the benefit of augmenting additional knowledge within the prompt/statement is examined in this work. Our empirical study suggests that DeBERTa-lg, a discriminative general-purpose natural language inference model, obtains the highest F1 score of 0.77 on the test set, securing the fourth rank on the leaderboard. Intriguingly, the augmentation of knowledge yields subpar results across most cases.

2023

pdf bib
UM-DFKI Maltese Speech Translation
Aiden Williams | Kurt Abela | Rishu Kumar | Martin Bär | Hannah Billinghurst | Kurt Micallef | Ahnaf Mozib Samin | Andrea DeMarco | Lonneke van der Plas | Claudia Borg
Proceedings of the 20th International Conference on Spoken Language Translation (IWSLT 2023)

For the 2023 IWSLT Maltese Speech Translation Task, UM-DFKI jointly presents a cascade solution which achieves 0.6 BLEU. While this is the first time that a Maltese speech translation task has been released by IWSLT, this paper explores previous solutions for other speech translation tasks, focusing primarily on low-resource scenarios. Moreover, we present our method of fine-tuning XLS-R models for Maltese ASR using a collection of multi-lingual speech corpora as well as the fine-tuning of the mBART model for Maltese to English machine translation.

pdf bib
garNER at SemEval-2023: Simplified Knowledge Augmentation for Multilingual Complex Named Entity Recognition
Md Zobaer Hossain | Averie Ho Zoen So | Silviya Silwal | H. Andres Gonzalez Gongora | Ahnaf Mozib Samin | Jahedul Alam Junaed | Aritra Mazumder | Sourav Saha | Sabiha Tahsin Soha
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper presents our solution, garNER, to the SemEval-2023 MultiConer task. We propose a knowledge augmentation approach by directly querying entities from the Wikipedia API and appending the summaries of the entities to the input sentence. These entities are either retrieved from the labeled training set (Gold Entity) or from off-the-shelf entity taggers (Entity Extractor). Ensemble methods are then applied across multiple models to get the final prediction. Our analysis shows that the added contexts are beneficial only when such contexts are relevant to the target-named entities, but detrimental when the contexts are irrelevant.

2022

pdf bib
Arguments to Key Points Mapping with Prompt-based Learning
Ahnaf Mozib Samin | Behrooz Nikandish | Jingyan Chen
Proceedings of the 5th International Conference on Natural Language and Speech Processing (ICNLSP 2022)