Gautam Kumar

2026

Agentic Pipelines Meet Retrieval-Augmented ICL: A Zero-Training Approach to Mental Health Modeling
Anson Antony | Gautam Kumar | Annika Marie Schoene
Proceedings of the 10th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2026)

This paper describes a system for the CLPsych 2026 shared task that uses retrieval-augmented in-context learning with frozen LLMs and no fine-tuning. The core contribution is a five-agent agentic pipeline for Task 3.1 sequence summarisation: two rule-based agents detect change type (Switch/Escalation) and direction (improvement/deterioration), an LLM-based DynamicsExtractor produces structured ABCD analysis, a SummaryWriter composes prose grounded in retrieved gold exemplars, and a Validator enforces structural constraints. This pipeline is iteratively refined across three submissions via NLI-based candidate reranking and per-sentence contradiction reduction. For Tasks 1.1 and 1.2, a single LLM call combines static and RAG-retrieved examples; for Task 2, an auto-tuned prompt detects moments of change. The system ranked 1st on Task 1.2 (RMSE 0.917) and Task 3.1 (score rank average 4.00), 3rd on Task 1.1 (F1 0.420), and 8th on Task 2 (F1 0.466).

2023

Hunt for Buried Treasures: Extracting Unclaimed Embodiments from Patent Specifications
Chikara Hashimoto | Gautam Kumar | Shuichiro Hashimoto | Jun Suzuki
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track)

Patent applicants write patent specifications that describe embodiments of inventions. Some embodiments are claimed for a patent, while others may be unclaimed due to strategic considerations. Unclaimed embodiments may be extracted by applicants later and claimed in continuing applications to gain advantages over competitors. Despite being essential for corporate intellectual property (IP) strategies, unclaimed embodiment extraction is conducted manually, and little research has been conducted on its automation. This paper presents a novel task of unclaimed embodiment extraction (UEE) and a novel dataset for the task. Our experiments with Transformer-based models demonstrated that the task was challenging as it required conducting natural language inference on patent specifications, which consisted of technical, long, syntactically and semantically involved sentences. We release the dataset and code to foster this new area of research.

2022

A Large-Scale Japanese Dataset for Aspect-based Sentiment Analysis
Yuki Nakayama | Koji Murakami | Gautam Kumar | Sudha Bhingardive | Ikuko Hardaway
Proceedings of the Thirteenth Language Resources and Evaluation Conference

There has been significant progress in the field of sentiment analysis. However, aspect-based sentiment analysis (ABSA) has not been explored in the Japanese language even though it has a huge scope in many natural language processing applications such as 1) tracking sentiment towards products, movies, politicians, etc.; 2) improving customer relation models. The main reason behind this is that there is no standard Japanese dataset available for the ABSA task. In this paper, we present the first standard Japanese dataset for the hotel reviews domain. The proposed dataset contains 53,192 review sentences with seven aspect categories and two polarity labels. We perform experiments on this dataset using popular ABSA approaches and report error analysis. Our experiments show that contextual models such as BERT work very well for the ABSA task in the Japanese language, and our error analysis shows the need to focus on other NLP tasks for better performance.

Co-authors

Koji Murakami 1

Yuki Nakayama 1

Annika Marie Schoene 1

Jun Suzuki 1
