Rudra Mutalik

2025

CPIQA: Climate Paper Image Question Answering Dataset for Retrieval-Augmented Generation with Context-based Query Expansion
Rudra Mutalik | Abiram Panchalingam | Loitongbam Gyanendro Singh | Timothy J. Osborn | Ed Hawkins | Stuart E. Middleton
Proceedings of the 2nd Workshop on Natural Language Processing Meets Climate Change (ClimateNLP 2025)

Misinformation about climate science is a serious challenge for our society. This paper introduces CPIQA (Climate Paper Image Question-Answering), a new question-answer dataset featuring 4,551 full-text open-source academic papers in the area of climate science with 54,612 GPT-4o generated question-answer pairs. CPIQA contains four question types (numeric, figure-based, non-figure-based, reasoning), each generated using three user roles (expert, non-expert, climate sceptic). CPIQA is multimodal, incorporating information from figures and graphs with GPT-4o descriptive annotations. We describe Context-RAG, a novel method for RAG prompt decomposition and augmentation involving extracting distinct contexts for the question. Evaluation results for Context-RAG on the benchmark SPIQA dataset outperforms the previous best state of the art model in two out of three test cases. For our CPIQA dataset, Context-RAG outperforms our standard RAG baseline on all five base LLMs we tested, showing our novel contextual decomposition method can generalize to any LLM architecture. Expert evaluation of our best performing model (GPT-4o with Context-RAG) by climate science experts highlights strengths in precision and provenance tracking, particularly for figure-based and reasoning questions.

2024

pdf bib abs

Extracting and Summarizing Evidence of Suicidal Ideation in Social Media Contents Using Large Language Models
Loitongbam Gyanendro Singh | Junyu Mao | Rudra Mutalik | Stuart E. Middleton
Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024)

This paper explores the use of Large Language Models (LLMs) in analyzing social media content for mental health monitoring, specifically focusing on detecting and summarizing evidence of suicidal ideation. We utilized LLMs Mixtral7bx8 and Tulu-2-DPO-70B, applying diverse prompting strategies for effective content extraction and summarization. Our methodology included detailed analysis through Few-shot and Zero-shot learning, evaluating the ability of Chain-of-Thought and Direct prompting strategies. The study achieved notable success in the CLPsych 2024 shared task (ranked top for the evidence extraction task and second for the summarization task), demonstrating the potential of LLMs in mental health interventions and setting a precedent for future research in digital mental health monitoring.

Co-authors

Abiram Panchalingam 1

Venues

Fix author