A baseline for self-state identification and classification in mental health data: CLPsych 2025 Task

Laerdon Kim

Abstract

We present a baseline for Task A.1 of CLPsych 2025: classifying self-states in mental health data taken from Reddit. We use few-shot learning with a 4-bit quantized Gemma 2 9B model (Gemma Team, 2024; Brown et al., 2020; Daniel Han and team, 2023) and a data preprocessing step that splits each post into sentences. The system first identifies the sentences that indicate self-state evidence, and then performs a binary classification to determine whether each such sentence is evidence of an adaptive or a maladaptive self-state. This system outperforms our other method, which relies on an LLM to highlight spans of variable length independently. We attribute this performance to two benefits of the sentence chunking step: partitioning posts into sentences 1) broadly matches the granularity at which self-states were human-annotated and 2) simplifies the task for our language model to a binary classification problem. Our system placed third out of fourteen systems submitted for Task A.1, with a test-time recall of 0.579.
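
The two-stage flow described in the abstract (split a post into sentences, keep the relevant ones, then label each as adaptive or maladaptive) can be sketched roughly as below. This is only an illustrative outline, not the paper's implementation: the function names, the regex sentence splitter, and the keyword-based stand-ins for the few-shot Gemma 2 9B calls are all assumptions made for the example.

```python
import re
from typing import Callable, Dict, List


def split_sentences(post: str) -> List[str]:
    """Naive splitter (assumption): break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", post.strip())
    return [p for p in parts if p]


def extract_evidence(
    post: str,
    is_relevant: Callable[[str], bool],
    label_state: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Stage 1: filter sentences; stage 2: binary adaptive/maladaptive label."""
    evidence: Dict[str, List[str]] = {"adaptive": [], "maladaptive": []}
    for sentence in split_sentences(post):
        if not is_relevant(sentence):
            continue  # sentence carries no self-state evidence
        label = label_state(sentence)
        if label not in evidence:
            raise ValueError(f"unexpected label: {label!r}")
        evidence[label].append(sentence)
    return evidence


# Hypothetical stand-ins for the two few-shot LLM prompts. In a real system
# each of these would wrap a call to the language model.
def toy_is_relevant(sentence: str) -> bool:
    return re.search(r"\b(i|me|my)\b", sentence.lower()) is not None


def toy_label_state(sentence: str) -> str:
    negative_cues = ("hopeless", "worthless", "can't")
    lowered = sentence.lower()
    return "maladaptive" if any(c in lowered for c in negative_cues) else "adaptive"


post = (
    "The weather was bad today. I feel hopeless about everything. "
    "I called my sister and it helped."
)
result = extract_evidence(post, toy_is_relevant, toy_label_state)
print(result)
```

Passing the two stages in as callables keeps the chunking logic separate from the model, so the keyword stubs can be swapped for model-backed functions without changing `extract_evidence`.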

Anthology ID: 2025.clpsych-1.17
Volume: Proceedings of the 10th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2025)
Month: May
Year: 2025
Address: Albuquerque, New Mexico
Editors: Ayah Zirikly, Andrew Yates, Bart Desmet, Molly Ireland, Steven Bedrick, Sean MacAvaney, Kfir Bar, Yaakov Ophir
Venues: CLPsych | WS
Publisher: Association for Computational Linguistics
Pages: 218–224
URL: https://preview.aclanthology.org/fix-sig-urls/2025.clpsych-1.17/
Cite (ACL): Laerdon Kim. 2025. A baseline for self-state identification and classification in mental health data: CLPsych 2025 Task. In Proceedings of the 10th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2025), pages 218–224, Albuquerque, New Mexico. Association for Computational Linguistics.
Cite (Informal): A baseline for self-state identification and classification in mental health data: CLPsych 2025 Task (Kim, CLPsych 2025)
PDF: https://preview.aclanthology.org/fix-sig-urls/2025.clpsych-1.17.pdf
