2025
Osint at AraGenEval Shared Task: Fine-Tuned Modeling for Tracking Style Signatures and AI Generation in Arabic Texts
Shifali Agrahari | Hemanth Prakash Simhadri | Ashutosh Kumar Verma | Ranbir Singh Sanasam
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks
Teaching Sarcasm: Few-Shot Multimodal Sarcasm Detection via Distillation to a Parameter-Efficient Student
Soumyadeep Jana | Ranbir Singh Sanasam
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Multimodal sarcasm detection is challenging, especially in low-resource settings where scarce annotated data makes subtle image-text contradictions hard to learn, hindering model performance. Parameter-efficient fine-tuning (PEFT) methods such as adapters, LoRA, and prompt tuning reduce overfitting but struggle to reach optimal performance due to the limited supervision available from few-shot data. We propose PEKD, a unified framework that enhances PEFT methods via distillation from an expert model trained on large-scale sarcasm data, which acts as the teacher. To mitigate unreliable signals from the teacher, we introduce an entropy-aware gating mechanism that dynamically adjusts the distillation strength based on teacher confidence. Experiments on two public datasets demonstrate that our PEKD framework enables PEFT methods to outperform both prior parameter-efficient approaches and large multimodal models, achieving strong results in the few-shot scenario. The framework is modular and adaptable to a wide range of multimodal models and tasks.
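As a rough illustration of the entropy-aware gating idea described in the abstract, the sketch below scales a per-sample distillation loss by the teacher's normalized predictive confidence. The function name, temperature, and the exact gating form are assumptions for illustration, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def entropy_gated_kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Hypothetical sketch: per-sample KD loss scaled by a confidence gate
    that down-weights samples where the teacher's entropy is high."""
    t_probs = F.softmax(teacher_logits / temperature, dim=-1)
    s_log_probs = F.log_softmax(student_logits / temperature, dim=-1)

    # Teacher entropy, normalized to [0, 1] by the maximum entropy log(C)
    entropy = -(t_probs * t_probs.clamp_min(1e-12).log()).sum(dim=-1)
    max_entropy = torch.log(torch.tensor(float(teacher_logits.size(-1))))
    gate = 1.0 - entropy / max_entropy  # 1 = confident teacher, 0 = uniform

    # Per-sample KL(teacher || student), gated, then averaged over the batch
    kd = (t_probs * (t_probs.clamp_min(1e-12).log() - s_log_probs)).sum(dim=-1)
    return (gate * kd).mean() * temperature**2
```

This loss would be added to the ordinary task loss on the PEFT student; the gate keeps unreliable teacher predictions from dominating the few-shot updates.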
Can You Really Trust That Review? ProtoFewRoBERTa and DetectAIRev: A Prototypical Few-Shot Method and Multi-Domain Benchmark for Detecting AI-Generated Reviews
Shifali Agrahari | Sujit Kumar | Ranbir Singh Sanasam
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Synthetic reviews mislead users and erode trust in online marketplaces, and the advent of Large Language Models (LLMs) makes detecting such AI-generated content increasingly challenging due to its human-like fluency and coherence. Existing LLM-generated review detection datasets are limited to one or a few domains, with reviews generated by only a few LLMs, and therefore lack diversity in both domain coverage and generation style. Models trained on such datasets generalize poorly, lacking cross-model adaptation and struggling to detect diverse LLM-generated reviews in real-world, open-domain scenarios. To address this, we introduce DetectAIRev, a benchmark dataset for AI-generated review detection that includes human-written reviews from diverse domains and reviews generated by various categories of LLMs. We evaluate the quality and reliability of the proposed dataset through several ablation studies and human evaluations. We further propose ProtoFewRoBERTa, a few-shot detection framework that combines prototypical networks with RoBERTa embeddings and learns discriminative features across multiple LLMs and human-written text from only a few labeled examples per class, enabling it to attribute a text to its LLM author. We conduct experiments on DetectAIRev and a publicly available benchmark dataset, and our results suggest that the proposed methods outperform state-of-the-art baseline models on AI-generated review and text detection.
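A minimal sketch of the prototypical-network component named in the abstract: class prototypes are means of RoBERTa [CLS] embeddings over a few support examples, and queries are scored by distance to each prototype. The roberta-base checkpoint, function names, and the use of Euclidean distance are illustrative assumptions, not the paper's exact setup.

```python
import torch
from transformers import RobertaModel, RobertaTokenizer

# Hypothetical sketch of one prototypical episode over RoBERTa embeddings.
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
encoder = RobertaModel.from_pretrained("roberta-base")

def embed(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = encoder(**batch)
    return out.last_hidden_state[:, 0]  # [CLS] embedding per text

def prototypes(support_texts_by_class):
    # One prototype per class: the mean of its few support embeddings
    return torch.stack([embed(texts).mean(dim=0) for texts in support_texts_by_class])

def classify(query_texts, protos):
    q = embed(query_texts)
    dists = torch.cdist(q, protos)   # distance to each class prototype
    return (-dists).softmax(dim=-1)  # nearest prototype = highest probability
```

In training, the encoder would be fine-tuned episodically so that human-written text and each LLM's output cluster around separate prototypes.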
An Information-Theoretic Approach to Reducing Fertility in LLMs for Manipuri Machine Translation
Telem Joyson Singh | Ranbir Singh Sanasam | Priyankoo Sarmah
Proceedings of the 14th International Joint Conference on Natural Language Processing and the 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics
Large language models (LLMs) have transformed machine translation, yet they suffer from high subword fertility on low-resource languages, which slows inference and increases cost. While vocabulary expansion via continual pre-training is a common solution, it often degrades translation quality and requires large target-language corpora, which are unavailable for truly low-resource languages. To address this, we investigate tokenization efficiency through an information-theoretic lens, building on the established hypothesis that word length correlates with information content. From this perspective, we characterize tokenization inefficiency as high fertility on low-information (highly predictable) words. Guided by this principle, we introduce a novel fine-tuning strategy that systematically identifies informationally redundant words, those with high fertility but low information content, for targeted vocabulary expansion and model fine-tuning. Experiments fine-tuning BLOOM and LLaMA-3 on English-Manipuri and two other language pairs show that our method reduces fertility by 50% and accelerates inference by more than 2x, while matching and often exceeding the translation quality of standard LLM baselines, providing a theoretically grounded solution for efficient LLM-based MT.
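A minimal sketch of the selection step the abstract implies: flag words that the tokenizer splits into many subwords (high fertility) yet carry little information, approximated here by unigram surprisal. The thresholds and the surprisal estimate are assumptions for illustration, not the paper's procedure.

```python
import math
from collections import Counter

def redundant_words(corpus_words, tokenize, min_fertility=3, max_surprisal=8.0):
    """Hypothetical sketch: return candidate words for vocabulary expansion,
    i.e. high-fertility words with low estimated information content."""
    counts = Counter(corpus_words)
    total = sum(counts.values())
    flagged = []
    for word, count in counts.items():
        fertility = len(tokenize(word))        # subword pieces per word
        surprisal = -math.log2(count / total)  # crude unigram information estimate
        if fertility >= min_fertility and surprisal <= max_surprisal:
            flagged.append((word, fertility, surprisal))
    # Highest-fertility candidates first: best targets for new vocab entries
    return sorted(flagged, key=lambda x: -x[1])
```

Each flagged word would then receive a single new vocabulary entry, so frequent, predictable words no longer cost several decoding steps apiece.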
2024
Continuous Attentive Multimodal Prompt Tuning for Few-Shot Multimodal Sarcasm Detection
Soumyadeep Jana | Animesh Dey | Ranbir Singh Sanasam
Proceedings of the 28th Conference on Computational Natural Language Learning
With the steep rise in multimodal content on social media, multimodal sarcasm detection has gained widespread attention from research communities. Existing studies depend on large-scale data, which is challenging to obtain and expensive to annotate, making it important to study the problem in a few-shot scenario. Overly complex multimodal models are prone to overfitting on in-domain data, which hampers their performance on out-of-distribution (OOD) data. To address these issues, we propose the Continuous Attentive Multimodal Prompt Tuning model (CAMP), which leverages the prompt-tuning paradigm for few-shot multimodal sarcasm detection. To overcome the siloed learning process of continuous prompt tokens, we design a novel continuous multimodal attentive prompt in which the continuous tokens engage with both image and text tokens, enabling the assimilation of knowledge from the different input modalities. Experimental results indicate that our method outperforms other multimodal baselines in both few-shot and OOD scenarios.
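As a hedged sketch of the attentive-prompt idea, the module below keeps a set of learnable continuous prompt tokens that cross-attend over fused image and text tokens before being prepended to the text sequence. All names, dimensions, and the single cross-attention layer are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class AttentivePromptSketch(nn.Module):
    """Illustrative sketch: continuous prompt tokens attending over both
    modalities, rather than being learned in isolation."""
    def __init__(self, n_prompts=8, dim=512, n_heads=8):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(n_prompts, dim) * 0.02)
        self.cross_attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, image_tokens, text_tokens):
        # image_tokens: (B, Ni, D), text_tokens: (B, Nt, D)
        context = torch.cat([image_tokens, text_tokens], dim=1)
        queries = self.prompts.unsqueeze(0).expand(context.size(0), -1, -1)
        # Prompt tokens absorb knowledge from both modalities via attention
        attended, _ = self.cross_attn(queries, context, context)
        # Prepend the attended prompts to the text sequence for the backbone
        return torch.cat([attended, text_tokens], dim=1)
```

Because only the prompt parameters (and the small attention head) are trained, the backbone stays frozen, which is what keeps the approach viable in the few-shot regime.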