Gautam Kumar


2026

This paper describes a system for the CLPsych 2026 shared task that uses retrieval-augmented in-context learning with frozen LLMs and no fine-tuning. The core contribution is a five-agent agentic pipeline for Task 3.1 sequence summarisation: two rule-based agents detect change type (Switch/Escalation) and direction (improvement/deterioration), an LLM-based DynamicsExtractor produces structured ABCD analysis, a SummaryWriter composes prose grounded in retrieved gold exemplars, and a Validator enforces structural constraints. This pipeline is iteratively refined across three submissions via NLI-based candidate reranking and per-sentence contradiction reduction. For Tasks 1.1 and 1.2, a single LLM call combines static and RAG-retrieved examples; for Task 2, an auto-tuned prompt detects moments of change. The system ranked 1st on Task 1.2 (RMSE 0.917) and Task 3.1 (score rank average 4.00), 3rd on Task 1.1 (F1 0.420), and 8th on Task 2 (F1 0.466).
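The NLI-based candidate reranking mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the scoring rule, so everything here is an assumption: the function names, the naive sentence splitter, the choice of mean per-sentence contradiction probability as the ranking key, and the toy scorer that stands in for a real NLI model.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter (assumption: punctuation followed by whitespace)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def rerank_by_contradiction(
    candidates: List[str],
    evidence: str,
    contradiction_prob: Callable[[str, str], float],
) -> List[str]:
    """Sort candidate summaries by mean per-sentence contradiction score, lowest first.

    `contradiction_prob(premise, hypothesis)` is a hypothetical hook that would
    wrap an NLI model and return P(contradiction) in [0, 1].
    """

    def mean_contradiction(summary: str) -> float:
        sentences = split_sentences(summary)
        if not sentences:
            return 1.0  # treat an empty summary as the worst case
        scores = [contradiction_prob(evidence, s) for s in sentences]
        return sum(scores) / len(scores)

    return sorted(candidates, key=mean_contradiction)


def toy_contradiction_prob(premise: str, hypothesis: str) -> float:
    """Placeholder scorer, NOT a real NLI model: flags 'improv*' against 'worse'."""
    if "improv" in hypothesis.lower() and "worse" in premise.lower():
        return 0.9
    return 0.1
```

In a real pipeline the toy scorer would be replaced by a call to an actual NLI classifier; the ranking logic itself would not need to change.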

2023

Patent applicants write patent specifications that describe embodiments of inventions. Some embodiments are claimed for a patent, while others may be unclaimed due to strategic considerations. Unclaimed embodiments may be extracted by applicants later and claimed in continuing applications to gain advantages over competitors. Despite being essential for corporate intellectual property (IP) strategies, unclaimed embodiment extraction is conducted manually, and little research has been conducted on its automation. This paper presents a novel task of unclaimed embodiment extraction (UEE) and a novel dataset for the task. Our experiments with Transformer-based models demonstrated that the task was challenging as it required conducting natural language inference on patent specifications, which consisted of technical, long, syntactically and semantically involved sentences. We release the dataset and code to foster this new area of research.

2022

There has been significant progress in the field of sentiment analysis. However, aspect-based sentiment analysis (ABSA) has not been explored in the Japanese language, even though it has broad scope in many natural language processing applications, such as (1) tracking sentiment towards products, movies, politicians, etc., and (2) improving customer relation models. The main reason is that no standard Japanese dataset is available for the ABSA task. In this paper, we present the first standard Japanese dataset for the hotel reviews domain. The proposed dataset contains 53,192 review sentences with seven aspect categories and two polarity labels. We perform experiments on this dataset using popular ABSA approaches and report an error analysis. Our experiments show that contextual models such as BERT work very well for the ABSA task in the Japanese language, and our error analysis shows the need to focus on other NLP tasks for better performance.