V. V. Saradhi

2026

CIARAM: Class Imbalance Aware Generative Framework for Relational Argument Mining
Nilmadhab Das | Sayan Pal | V. V. Saradhi | Ashish Anand
Proceedings of the Fifteenth Language Resources and Evaluation Conference

Relational Argument Mining (RAM) is a key task in computational argumentation that aims to classify the relationship, such as Support or Attack, between pairs of argument components (ACs). Traditional approaches rely primarily on graph-based modelling with external knowledge sources, which makes them complex. These approaches also struggle on RAM datasets with imbalanced relation classes, as they are not designed for class-imbalanced scenarios. In this work, we propose the CIARAM framework, which reformulates RAM as a text-to-text generation problem in which relational labels are generated in a flattened text format. To address class imbalance, we employ a data augmentation strategy that uses a decoder-only Large Language Model (LLM) to balance the underrepresented relation classes. Across five standard RAM benchmarks, CIARAM produces strong results, particularly with the billion-parameter model, with a substantial gain in performance over the latest baseline, demonstrating the strong potential of our approach.


APTFiNER: Annotation Preserving Translation for Fine-grained Named Entity Recognition
Prachuryya Kaushik | Adittya Gupta | Ajanta Maurya | Gautam Sharma | V. V. Saradhi | Ashish Anand
Proceedings of the Fifteenth Language Resources and Evaluation Conference

We present APTFiNER, a novel fine-grained named entity recognition (FgNER) dataset covering six low-resource Indian languages spoken by over 400 million people across various nations. While creating FgNER resources through manual annotation is typically expensive and labor-intensive, distant supervision has emerged as a workable alternative. Yet such FgNER datasets are often noisy, as each entity mention is often assigned multiple entity types, which necessitates computationally demanding noise-aware models. Furthermore, resources for both coarse-grained and fine-grained NER tasks remain scarce for low-resource languages. To overcome this scarcity, we utilized the superior reasoning and translation capability of Gemini through the proposed annotation-preserving translation method and created a large-scale FgNER dataset comprising over 411 thousand sentences, 697 thousand entity mentions, and 5.8 million tokens in total. We translated the MultiCoNER2 English FgNER dataset to the target languages: Assamese (as), Marathi (mr), Nepali (ne), Tamil (ta), Telugu (te), and a vulnerable language, Bodo (brx). Through rigorous analyses and human evaluations, the effectiveness of our method and the high quality of the resulting dataset are ascertained, with F1 score improvements of 8% in both Tamil and Telugu, and 25% in Marathi, over the current state-of-the-art. The dataset, expert detector models, the agentic tool, and the interactive web application are available as open-source resources at: https://hf.co/collections/prachuryyaIITG/aptfiner.

Co-authors

Sayan Pal 1

Gautam Sharma 1

Venues

LREC 2
