Debmalya Pal

2026

Prompt Stylometry for On-Device Affect-Adaptive AI: A Feasibility Study in Linguistic Signal Detection and Response Steering
Debmalya Pal
Proceedings of the Seventh Workshop on Privacy in Natural Language Processing

Every user prompt contains latent linguistic signals beyond its explicit semantic content (lexical choice, hedging, sentence structure, and discourse patterns) that reflect the user’s affective state and cognitive style. Yet most large language models are optimized for generalized assistant behavior rather than explicit adaptation to these fine-grained signals. We introduce Prompt Stylometry, a framework for detecting affective and cognitive-style signals directly from user prompts and using them to steer response generation. We study two categories of signals: affect-related cues associated with emotional states, and cognitive-style cues associated with patterns such as analytical, exploratory, self-critical, or indecisive reasoning. This inference capability, however, creates substantial privacy risks: any system processing prompts server-side could implicitly profile users’ psychological states without their knowledge or consent. This motivates our core design choice of a fully on-device architecture in which no interaction data leaves the user’s device. We benchmark three annotation paradigms (lexicon-based, neural, and generative) across 600 synthetic prompts spanning 30 stylometric profiles, and evaluate affect-adaptive response steering across two small language model families under 5B parameters. Our results show systematic differences in both signal detection behavior and downstream steering responsiveness across annotation methods and model families, demonstrating the feasibility of privacy-preserving affect-adaptive AI on consumer hardware while identifying annotation paradigm sensitivity and cross-profile transfer as key open challenges.
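As a rough illustration of what a lexicon-based annotator for one such signal (hedging) might look like, the sketch below scores a prompt by the share of its tokens found in a small hedge-word list. The lexicon, the `hedge_rate` function, and the tokenization rule are all assumptions made for this example; the abstract does not specify the paper's actual lexicon or scoring.

```python
import re

# Hypothetical hedge lexicon, for illustration only (not the paper's lexicon).
HEDGE_TERMS = {
    "maybe", "perhaps", "possibly", "might",
    "could", "probably", "guess", "unsure",
}


def hedge_rate(prompt: str) -> float:
    """Return the fraction of word tokens in the prompt that are hedge terms."""
    # Lowercase and keep alphabetic tokens (apostrophes allowed inside words).
    tokens = re.findall(r"[a-z']+", prompt.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in HEDGE_TERMS)
    return hits / len(tokens)
```

A score like this runs entirely locally with no model calls, which fits the on-device setting the abstract describes, though a real annotator would presumably need a far richer lexicon and handling of multi-word hedges.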


Probing and Steering Uncertainty in Biomedical Language Models: Representational Structure and Behavioral Limits
Debmalya Pal
BioNLP 2026

Biomedical language models can generate overly confident clinical statements despite incomplete or ambiguous evidence. We study whether linguistic uncertainty (the hedged epistemic stance expressed in phrases such as "consistent with" or "cannot exclude") is encoded in model representations and can be controlled without retraining. Across six biomedical language models spanning two architectures (causal decoders and bidirectional encoders), we show that uncertainty is captured by robust low-dimensional linear structure in hidden states. We then apply activation steering to manipulate this representation directly, increasing hedged generation in decoder models and inducing targeted uncertainty-related shifts in encoder representations. Together, these results show that epistemic stance is not merely a surface linguistic phenomenon but an interpretable and controllable feature of biomedical language model representations, with implications for safer and more calibrated clinical text generation.
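The general idea of activation steering along a linear direction can be sketched in a few lines. The example below assumes a difference-of-means direction between hidden states of hedged and confident text, which is one common way to obtain such a direction and is not necessarily the method used in the paper. The function names and the scaling parameter `alpha` are hypothetical, and plain Python lists stand in for real hidden-state tensors.

```python
from typing import List

Vector = List[float]


def mean_vector(rows: List[Vector]) -> Vector:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_direction(hedged: List[Vector], confident: List[Vector]) -> Vector:
    """Difference of class means: points from 'confident' toward 'hedged'."""
    mu_hedged = mean_vector(hedged)
    mu_confident = mean_vector(confident)
    return [h - c for h, c in zip(mu_hedged, mu_confident)]


def steer(hidden: Vector, direction: Vector, alpha: float) -> Vector:
    """Shift a hidden state along the direction, scaled by alpha (assumed knob)."""
    return [x + alpha * d for x, d in zip(hidden, direction)]
```

In an actual model the shift would be applied to a chosen layer's hidden states during the forward pass; which layer and what magnitude work best are empirical questions that this sketch leaves open.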
