NLP-AI4Health

Parameswari Krishnamurthy, Vandan Mujadia, Dipti Misra Sharma, Hannah Mary Thomas (Editors)


Anthology ID:
2025.nlpai4health-main
Month:
December
Year:
2025
Address:
Mumbai, India
Venues:
NLP-AI4Health | WS
SIG:
Publisher:
Association for Computational Linguistics
URL:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.nlpai4health-main/
DOI:
ISBN:
979-8-89176-315-9
PDF:
https://preview.aclanthology.org/ingest-ijcnlp-aacl/2025.nlpai4health-main.pdf

NLP-AI4Health
Parameswari Krishnamurthy | Vandan Mujadia | Dipti Misra Sharma | Hannah Mary Thomas

Enhancing Patient-Centric Healthcare Communication Through Multimodal Emotion Recognition: A Transformer-Based Framework for Clinical Decision Support
Vineet Channe

This paper presents a multimodal emotion analysis framework designed to enhance patient-centric healthcare communication and support clinical decision-making. Our system addresses automated patient emotion monitoring during consultations, telemedicine sessions, and mental health screenings by combining audio transcription, facial emotion analysis, and text processing. Using emotion patterns from the CREMA-D dataset as a foundation for healthcare-relevant emotional expressions, we introduce a novel emotion-annotated text format “[emotion] transcript [emotion]” integrating Whisper-based audio transcription with DeepFace facial emotion analysis. We systematically evaluate eight transformer architectures (BERT, RoBERTa, DeBERTa, XLNet, ALBERT, DistilBERT, ELECTRA, and BERT-base) for three-class clinical emotion classification: Distress/Negative (anxiety, fear), Stable/Neutral (baseline), and Engaged/Positive (comfort). Our multimodal fusion strategy achieves 86.8% accuracy with DeBERTa-v3-base, representing a 12.6% improvement over unimodal approaches and meeting clinical requirements for reliable patient emotion detection. Cross-modal attention analysis reveals that facial expressions provide crucial disambiguation, with stronger attention to negative emotions (0.41 vs 0.28), aligning with clinical priorities for detecting patient distress. Our contributions include emotion-annotated text representation for healthcare contexts, systematic transformer evaluation for clinical deployment, and a framework enabling real-time patient emotion monitoring and emotionally aware clinical decision support.
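
The emotion-annotated text format and the three-class clinical mapping described in the abstract can be sketched in a few lines. The following is an illustrative Python sketch, not the authors' code: the transcript and emotion labels are hypothetical inputs standing in for Whisper and DeepFace outputs, and the assignment of audio-derived vs. facial-derived emotion to the two tag slots, as well as the surprise-to-positive mapping, are assumptions.

```python
# Illustrative sketch of the "[emotion] transcript [emotion]" text format described
# in the abstract. Transcript and emotion labels are hypothetical stand-ins; in the
# paper they come from Whisper transcription and DeepFace facial analysis.

# Mapping of basic emotion labels to the three clinical classes named in the abstract.
# The "surprise" assignment is an assumption; it is not specified in the abstract.
CLINICAL_CLASS = {
    "angry": "Distress/Negative",
    "fear": "Distress/Negative",
    "sad": "Distress/Negative",
    "disgust": "Distress/Negative",
    "neutral": "Stable/Neutral",
    "happy": "Engaged/Positive",
    "surprise": "Engaged/Positive",
}

def annotate(transcript: str, audio_emotion: str, facial_emotion: str) -> str:
    """Wrap the transcript with one emotion tag per modality (assumed slot order)."""
    return f"[{audio_emotion}] {transcript} [{facial_emotion}]"

def clinical_label(emotion: str) -> str:
    """Map a basic emotion label to the three-class clinical scheme."""
    return CLINICAL_CLASS.get(emotion.lower(), "Stable/Neutral")

if __name__ == "__main__":
    text = annotate("I am worried about the biopsy results.", "fear", "sad")
    print(text)                    # [fear] I am worried about the biopsy results. [sad]
    print(clinical_label("fear"))  # Distress/Negative
```

In the full system, strings of this form would be fed to the transformer classifier (e.g., DeBERTa-v3-base) rather than printed.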

MOD-KG: MultiOrgan Diagnosis Knowledge Graph
Anas Anwarul Haq Khan | Pushpak Bhattacharyya

The human body is highly interconnected, where a diagnosis in one organ can influence conditions in others. In medical research, graphs (such as Knowledge Graphs and Causal Graphs) have proven useful for capturing these relationships, but constructing them manually with expert input is both costly and time-intensive, especially given the continuous flow of new findings. To address this, we leverage the extraction capabilities of large language models (LLMs) to build the MultiOrgan Diagnosis Knowledge Graph (MOD-KG). MOD-KG contains over 21,200 knowledge triples, derived from both textbooks (~13%) and carefully selected research papers (with an average of 444 citations each). The graph focuses primarily on the heart, lungs, kidneys, liver, pancreas, and brain, which are central to much of today’s multimodal imaging research. The extraction quality of the LLM was benchmarked against baselines over 1000 samples, demonstrating reliability. We will make our dataset public upon acceptance.
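
As an illustration of the kind of knowledge triples such a graph stores, here is a minimal Python sketch. The line-per-triple "head | relation | tail" output convention, the relation names, and the example diagnoses are hypothetical; the paper's actual extraction prompt and schema are not specified in the abstract.

```python
# Minimal sketch of a (head, relation, tail) knowledge-triple structure for a
# multi-organ diagnosis graph, plus a parser for a hypothetical LLM output format.
from dataclasses import dataclass

@dataclass(frozen=True)
class Triple:
    head: str        # e.g. a diagnosis or organ-level finding
    relation: str    # e.g. "increases_risk_of" (illustrative relation name)
    tail: str
    source: str      # textbook or paper identifier

def parse_triples(llm_output: str, source: str) -> list[Triple]:
    """Parse 'head | relation | tail' lines emitted by an extraction prompt."""
    triples = []
    for line in llm_output.strip().splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3:
            triples.append(Triple(parts[0], parts[1], parts[2], source))
    return triples

if __name__ == "__main__":
    demo = ("chronic kidney disease | increases_risk_of | heart failure\n"
            "liver cirrhosis | associated_with | hepatorenal syndrome")
    for t in parse_triples(demo, source="demo-textbook"):
        print(t)
```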

Cross-Lingual Mental Health Ontologies for Indian Languages: Bridging Patient Expression and Clinical Understanding through Explainable AI and Human-in-the-Loop Validation
Ananth Kandala | Ratna Kandala | Akshata Kishore Moharir | Niva Manchanda | Sunaina Singh Rathod

Mental health communication in India is linguistically fragmented, culturally diverse, and often underrepresented in clinical NLP. Current health ontologies and mental health resources are dominated by English or Western-centric diagnostic frameworks, leaving a gap in representing patient distress expressions in Indian languages. We propose the Cross-Lingual Graphs of Patient Distress Expressions (CL-PDE), a framework for building cross-lingual mental health ontologies through graph-based methods that capture culturally embedded expressions of distress, align them across languages, and link them with clinical terminology. Our approach addresses critical gaps in healthcare communication by grounding AI systems in culturally valid representations, enabling more inclusive and patient-centric NLP tools for mental health care in multilingual contexts.
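
A cross-lingual distress-expression graph of the kind the framework proposes could be represented roughly as follows. This is a schematic networkx sketch with invented example expressions and relation names; the CL-PDE ontology structure, alignment method, and clinical coding are not specified in the abstract.

```python
# Schematic sketch of a cross-lingual graph linking patient distress expressions to a
# clinical concept. Nodes, languages, and edge labels are invented examples; a real
# ontology would also attach clinical codes (e.g., SNOMED/ICD) to concept nodes.
import networkx as nx

G = nx.Graph()

# Patient expressions in different languages (illustrative only).
G.add_node("dil bhaari hona", lang="hi", kind="expression")   # Hindi idiom: "heavy heart"
G.add_node("mon kharap", lang="bn", kind="expression")        # Bangla: "low mood"
G.add_node("feeling down", lang="en", kind="expression")

# Clinical concept the expressions map to.
G.add_node("depressed mood", kind="clinical_term")

# Cross-lingual alignment and expression-to-clinical links.
G.add_edge("dil bhaari hona", "feeling down", relation="aligned_with")
G.add_edge("mon kharap", "feeling down", relation="aligned_with")
for expr in ("dil bhaari hona", "mon kharap", "feeling down"):
    G.add_edge(expr, "depressed mood", relation="maps_to")

# Retrieve all patient expressions linked to a clinical term.
linked = [n for n in G.neighbors("depressed mood") if G.nodes[n]["kind"] == "expression"]
print(linked)
```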

Automated Coding of Counsellor and Client Behaviours in Motivational Interviewing Transcripts: Validation and Application
Armaity Katki | Nathan Choi | Son Sophak Otra | George Flint | Kevin Zhu | Sunishchal Dev

Protein language models (PLMs) are powerful tools for protein engineering, but remain difficult to steer toward specific biochemical properties, where small sequence changes can affect stability or function. We adapt two prominent unsupervised editing methods: task arithmetic (TA; specifically, Forgetting via Negation) in weight space and feature editing with a sparse autoencoder (SAE) in activation space. We evaluate their effects on six biochemical properties of generations from three PLMs (ESM3, ProGen2-Large, and ProLLaMA). Across models, we observe complementary efficacies: TA more effectively controls some properties while SAE more effectively controls others. Property response patterns show some consistency across models. We suggest that the response pattern of biochemical properties should be considered when steering PLMs.
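
The weight-space operation referred to here, task arithmetic via negation, can be written down directly. The following numpy sketch is a generic illustration with an arbitrary scaling coefficient and toy parameter tensors, not the authors' implementation.

```python
# Generic weight-space sketch of task arithmetic via negation: subtract a scaled
# "task vector" (fine-tuned weights minus base weights) from the base weights.
# Small arrays stand in for model parameter tensors; alpha is an illustrative choice.
import numpy as np

def negate_task(base: dict, finetuned: dict, alpha: float = 0.5) -> dict:
    """theta_edited = theta_base - alpha * (theta_finetuned - theta_base)."""
    return {name: base[name] - alpha * (finetuned[name] - base[name]) for name in base}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = {"layer.weight": rng.normal(size=(4, 4))}
    finetuned = {"layer.weight": base["layer.weight"] + 0.1 * rng.normal(size=(4, 4))}
    edited = negate_task(base, finetuned, alpha=0.5)
    # The edit moves weights away from the fine-tuned direction.
    print(np.abs(edited["layer.weight"] - base["layer.weight"]).mean())
```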

Patient-Centric Question Answering- Overview of the Shared Task at the Second Workshop on NLP and AI for Multilingual and Healthcare Communication
Arun Zechariah | Balu Krishna | Hannah Mary Thomas | Joy Mammen | Dipti Misra Sharma | Parameswari Krishnamurthy | Vandan Mujadia | Priyanka Dasari | Vishnuraj Arjunaswamy

This paper presents an overview of the Shared Task on Patient-Centric Question Answering, organized as part of the NLP-AI4Health workshop at IJCNLP. The task aims to bridge the digital divide in healthcare by developing inclusive systems for two critical domains: Head and Neck Cancer (HNC) and Cystic Fibrosis (CF). We introduce the NLP4Health-2025 Dataset, a novel, large-scale multilingual corpus consisting of more than 45,000 validated multi-turn dialogues between patients and healthcare providers across 10 languages: Assamese, Bangla, Dogri, English, Gujarati, Hindi, Kannada, Marathi, Tamil, and Telugu. Participants were challenged to develop lightweight models (< 3 billion parameters) to perform two core activities: (1) Clinical Summarization, encompassing both abstractive summaries and structured clinical extraction (SCE), and (2) Patient-Centric QA, generating empathetic, factually accurate answers in the dialogue's native language. This paper details the hybrid human-agent dataset construction pipeline, task definitions, and evaluation metrics, and analyzes the performance of 9 submissions from 6 teams. The results demonstrate the viability of small language models (SLMs) in low-resource medical settings when optimized via techniques like LoRA and RAG.

Multilingual Clinical Dialogue Summarization and Information Extraction with Qwen-1.5B LoRA
Kunwar Zaid | Amit Sangroya | Jyotsana Khatri

This paper describes our submission to the NLP-AI4Health 2025 Shared Task on multilingual clinical dialogue summarization and structured information extraction. Our system is based on Qwen-1.5B Instruct fine-tuned with LoRA adapters for parameter-efficient adaptation. The pipeline produces (i) concise English summaries, (ii) schema-aligned JSON outputs, and (iii) multilingual Q&A responses. The Qwen-based approach substantially improves summary fluency, factual completeness, and JSON field coverage while maintaining efficiency within constrained GPU resources.
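
A parameter-efficient LoRA setup of the kind described could look roughly like the following. The Hugging Face checkpoint id, adapter rank, and target modules are assumptions for illustration, not the submission's actual configuration.

```python
# Rough sketch of attaching LoRA adapters to a small instruction-tuned causal LM
# with Hugging Face transformers + peft. Model id, rank, and target modules are
# illustrative assumptions, not the shared-task submission's exact settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

MODEL_ID = "Qwen/Qwen2-1.5B-Instruct"  # assumed checkpoint; the paper says "Qwen-1.5B Instruct"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

lora_config = LoraConfig(
    r=16,                      # adapter rank (illustrative)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trainable
# Fine-tuning then proceeds with a standard supervised loop over
# (dialogue -> summary / JSON / answer) training pairs.
```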

Patient-Centric Multilingual Question Answering and Summary Generation for Head and Neck Cancer and Blood Donation
Amol Shinde | Saloni Chitte | Prakash B. Pimpale

This paper describes a production-minded multilingual system built for the NLP-AI4Health shared task, designed to produce concise, medically accurate summaries and patient-friendly answers for Head and Neck Cancer (HNC) and Blood Donation. We finetune Gemma2-2B under a strict model size constraint (<3B parameters) using parameter-efficient adaptation (LoRA) and practical engineering to handle long dialogues, code-mixing, and multilingual scripts. The pipeline couples careful preprocessing, token-aware chunking, and constrained decoding with lightweight retrieval and verification steps. We report per-language quantitative metrics and provide an analysis of design choices and operational considerations for real-world deployment.
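
The token-aware chunking step mentioned here can be illustrated as below. The token budget, one-turn overlap, and the whitespace token counter are assumptions; a real system would count tokens with the model's own tokenizer.

```python
# Illustrative token-aware chunking of a long multi-turn dialogue: pack whole turns
# into chunks that stay under a token budget, carrying one turn forward as context.
# Budget, overlap, and the simple whitespace counter are illustrative assumptions.

def count_tokens(text: str) -> int:
    """Stand-in token counter; a real pipeline would use the model tokenizer."""
    return len(text.split())

def chunk_dialogue(turns: list[str], budget: int = 512, overlap: int = 1) -> list[list[str]]:
    chunks, current, used = [], [], 0
    for turn in turns:
        cost = count_tokens(turn)
        if current and used + cost > budget:
            chunks.append(current)
            current = current[-overlap:] if overlap else []   # carry context forward
            used = sum(count_tokens(t) for t in current)
        current.append(turn)
        used += cost
    if current:
        chunks.append(current)
    return chunks

if __name__ == "__main__":
    dialogue = [f"Turn {i}: " + "word " * 60 for i in range(10)]
    for i, chunk in enumerate(chunk_dialogue(dialogue, budget=200)):
        print(i, len(chunk), "turns")
```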

SAHA: Samvad AI for Healthcare Assistance
Aditya Kumar | Rakesh Kumar Nayak | Janhavi Naik | Ritesh Kumar | Dhiraj Bhatia | Shreya Agarwal

This paper deals with the dual task of developing a medical question answering (QA) system and generating concise summaries of medical dialogue data across nine languages (English and eight Indian languages). The medical dialogue data focuses on two critical health issues: Head and Neck Cancer (HNC) and Cystic Fibrosis (NLP-AI4Health shared task). The proposed framework utilises a dual approach: a fine-tuned small Multilingual Text-to-Text Transfer Transformer (mT5) model for the conversational summarisation component and a fine-tuned Retrieval Augmented Generation (RAG) system integrating the dense intfloat/e5-large language model for the language-independent QA component. The efficacy of the proposed approaches is demonstrated by achieving promising precision in the QA task. Our framework achieved the highest F1 scores in QA for three Indian languages, with scores of 0.3995 in Marathi, 0.7803 in Bangla, and 0.74759 in Hindi. We achieved the highest COMET score of 0.5626 on the Gujarati QA test set. For the dialogue summarisation task, our model registered the highest ROUGE-2 and ROUGE-L precision across all eight Indian languages; English was the sole exception. These results confirm our approach's potential to improve e-health dialogue systems for low-resource Indian languages.
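
The dense-retrieval component of such a RAG system might look roughly like this. The sentence-transformers wrapper, the E5 query/passage prefixes, and the toy passages are illustrative assumptions rather than the paper's actual retrieval setup.

```python
# Rough sketch of dense retrieval with an E5-style embedding model for a RAG QA
# pipeline: embed passages once, embed the query, and rank by cosine similarity.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("intfloat/e5-large")

passages = [
    "Head and neck cancer treatment may include surgery, radiotherapy, or chemotherapy.",
    "Blood donors should be well rested and hydrated before donation.",
]
# E5 models expect "passage: " / "query: " prefixes on their inputs.
passage_emb = model.encode([f"passage: {p}" for p in passages], normalize_embeddings=True)

def retrieve(question: str, top_k: int = 1) -> list[str]:
    query_emb = model.encode([f"query: {question}"], normalize_embeddings=True)
    scores = passage_emb @ query_emb[0]          # cosine similarity (embeddings normalized)
    order = np.argsort(-scores)[:top_k]
    return [passages[i] for i in order]

print(retrieve("What should I do before donating blood?"))
# Retrieved passages would then be passed to the generator to compose the answer.
```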

MedQwen-PE: Medical Qwen for Parameter-Efficient Multilingual Patient-Centric Summarization, Question Answering and Information Extraction
Vinay Babu Ulli | Anindita Mondal

This study addresses the Shared Task on Patient-Centric Multilingual Question Answering, which focuses on generating summaries and patient-oriented answers from multi-turn medical dialogues related to Head and Neck Cancer and Cystic Fibrosis across ten languages. The Qwen3-1.7B model is fine-tuned using QLoRA for three tasks—Summarization, Question Answering, and Information Extraction—while updating only approximately 1.6% of parameters through task-specific adapter layers. The resulting system demonstrates strong semantic fidelity, as evidenced by high BERTScore and COMET scores, particularly for Kannada, English, Telugu, and Tamil, with comparatively lower performance in Assamese, Bangla, Gujarati, and Marathi. The modular fine-tuning design enables efficient task adaptation while satisfying the constraints on model size and computational resources.
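
A QLoRA configuration of the kind described (4-bit base model plus small task-specific adapter layers) can be set up roughly as follows. The checkpoint id, quantization settings, and rank are illustrative assumptions, not the paper's exact configuration, and 4-bit loading requires a CUDA GPU with bitsandbytes installed.

```python
# Rough sketch of QLoRA: load the base model in 4-bit, then attach LoRA adapters so
# only a small fraction of parameters is trained. Settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-1.7B",               # assumed checkpoint id for "Qwen3-1.7B"
    quantization_config=bnb_config,
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()   # reports the trainable share (a few percent at most)
```

Separate adapter sets of this form, one per task, give the modular task-specific design the abstract describes.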

NLP4Health: Multilingual Clinical Dialogue Summarization and QA with mT5 and LoRA
Moutushi Roy | Dipankar Das

In this work, we present NLP4Health, a unified and reproducible pipeline to accomplish the tasks of multilingual clinical dialogue summarization and question answering (QA). Our system fine-tunes the multilingual sequence-to-sequence model google/mt5-base along with parameter-efficient Low-Rank Adaptation (LoRA) modules to support ten Indian languages. For each clinical dialogue, the model produces (1) a free-text English summary, (2) an English structured key–value (KnV) JSON summary, and (3) QA responses in the dialogue’s original language. We conducted preprocessing, fine-tuning, and inference, and evaluated across QA, textual, and structured metrics, analyzing performance in low-resource settings. The adapter weights, tokenizer, and inference scripts are publicly released to promote transparency and reproducibility.
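
For the sequence-to-sequence setting described here, the LoRA attachment differs slightly from decoder-only models. The following sketch uses assumed rank and target modules and is not the released adapter configuration.

```python
# Illustrative LoRA setup for a sequence-to-sequence model (google/mt5-base): the task
# type and attention projection names differ from decoder-only LMs. Rank and target
# modules are assumptions; the publicly released adapter config may differ.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import LoraConfig, get_peft_model

tokenizer = AutoTokenizer.from_pretrained("google/mt5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-base")

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["q", "v"],     # T5/mT5 attention projections are named q, k, v, o
    task_type="SEQ_2_SEQ_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# At inference, the adapted model is prompted (e.g., with a task prefix) to emit an
# English free-text summary, a key-value JSON summary, or a QA response in the
# dialogue's original language.
```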