Rajath Rao


2025

pdf bib
WhiSPA: Semantically and Psychologically Aligned Whisper with Self-Supervised Contrastive and Student-Teacher Learning
Rajath Rao | Adithya V Ganesan | Oscar Kjell | Jonah Luby | Akshay Raghavan | Scott M. Feltman | Whitney Ringwald | Ryan L. Boyd | Benjamin J. Luft | Camilo J. Ruggero | Neville Ryant | Roman Kotov | H. Schwartz
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Current speech encoding pipelines often rely on an additional text-based LM to get robust representations of human communication, even though SotA speech-to-text models often have a LM within. This work proposes an approach to improve the LM within an audio model such that the subsequent text-LM is unnecessary. We introduce **WhiSPA** (**Whi**sper with **S**emantic and **P**sychological **A**lignment), which leverages a novel audio training objective: contrastive loss with a language model embedding as a teacher. Using over 500k speech segments from mental health audio interviews, we evaluate the utility of aligning Whisper’s latent space with semantic representations from a text autoencoder (SBERT) and lexically derived embeddings of basic psychological dimensions: emotion and personality. Over self-supervised affective tasks and downstream psychological tasks, WhiSPA surpasses current speech encoders, achieving an average error reduction of 73.4% and 83.8%, respectively. WhiSPA demonstrates that it is not always necessary to run a subsequent text LM on speech-to-text output in order to get a rich psychological representation of human communication.

2024

pdf bib
Archetypes and Entropy: Theory-Driven Extraction of Evidence for Suicide Risk
Vasudha Varadarajan | Allison Lahnala | Adithya V Ganesan | Gourab Dey | Siddharth Mangalik | Ana-Maria Bucur | Nikita Soni | Rajath Rao | Kevin Lanning | Isabella Vallejo | Lucie Flek | H. Andrew Schwartz | Charles Welch | Ryan Boyd
Proceedings of the 9th Workshop on Computational Linguistics and Clinical Psychology (CLPsych 2024)

Research on psychological risk factors for suicide has developed for decades. However, combining explainable theory with modern data-driven language model approaches is non-trivial. In this study, we propose and evaluate methods for identifying language patterns aligned with theories of suicide risk by combining theory-driven suicidal archetypes with language model-based and relative entropy-based approaches. Archetypes are based on prototypical statements that evince risk of suicidality while relative entropy considers the ratio of how unusual both a risk-familiar and unfamiliar model find the statements. While both approaches independently performed similarly, we find that combining the two significantly improved the performance in the shared task evaluations, yielding our combined system submission with a BERTScore Recall of 0.906. Consistent with the literature, we find that titles are highly informative as suicide risk evidence, despite the brevity. We conclude that a combination of theory- and data-driven methods are needed in the mental health space and can outperform more modern prompt-based methods.