S. Rajendran

Also published as: Rajendran S


2026

The deployment of Large Language Models (LLMs) has intensified concerns regarding the propagation of societal stereotypes encoded in web-scale training corpora. This paper presents a dual-paradigm framework designed specifically to address multilingual gender inclusivity and counterfactual generation. For multilingual gender-neutral text transformation, a fine-tuned mT5 encoder–decoder model performs controlled sentence rewriting with minimal edits while preserving semantic fidelity and grammatical fluency. For counter-narrative generation, the Llama-3 8B decoder-only model is employed to generate empathetic and persuasive responses through structured prompt-based generation. The framework is evaluated using datasets from the LT-EDI ACL 2026 shared task across multiple languages, including English, Tamil, Kannada, German, and Spanish. Experimental results demonstrate strong effectiveness in identifying and neutralizing gender markers, particularly in morphologically rich languages, while the counter-narrative component achieves high performance in politeness, coherence, and relevance. Overall, the proposed approach contributes toward the development of responsible and inclusive multilingual NLP systems.

2008

We present a universal Parts-of-Speech (POS) tagset framework covering most of the Indian languages (ILs), following a hierarchical and decomposable tagset schema. Despite their significant number of speakers, most ILs have no workable POS tagset or tagger, which serve as fundamental building blocks for NLP research. Existing IL POS tagsets are often designed for a specific language; the few that have been designed for multiple languages cover only shallow linguistic features, ignoring the languages' linguistic richness and idiosyncrasies. The new framework proposed here addresses these deficiencies in an efficient and principled manner. We follow a hierarchical schema similar to that of EAGLES, which makes the framework flexible enough to capture the rich features of a language or language family while still capturing shared linguistic structures in a methodical way. The proposed common framework further facilitates the sharing and reuse of scarce resources in these languages and ensures cross-linguistic compatibility.