Nathan Choi


2025

Automated Coding of Counsellor and Client Behaviours in Motivational Interviewing Transcripts: Validation and Application
Armaity Katki | Nathan Choi | Son Sophak Otra | George Flint | Kevin Zhu | Sunishchal Dev
NLP-AI4Health

Protein language models (PLMs) are powerful tools for protein engineering, but they remain difficult to steer toward specific biochemical properties, where small sequence changes can affect stability or function. We adapt two prominent unsupervised editing methods: task arithmetic (TA; specifically, Forgetting via Negation) in weight space and feature editing with a sparse autoencoder (SAE) in activation space. We evaluate their effects on six biochemical properties of generations from three PLMs (ESM3, ProGen2-Large, and ProLLaMA). Across models, we observe complementary strengths: TA more effectively controls some properties while SAE more effectively controls others. Property response patterns show some consistency across models. We suggest that the response patterns of biochemical properties should be considered when steering PLMs.