Eric Xia


2025

Turn-by-Turn Behavior Monitoring in LM-Guided Psychotherapy
Anish Sai Chedalla | Samina Ali | Jiuming Chen | Starborn0128@gmail.com | Eric Xia
The 14th International Joint Conference on Natural Language Processing and The 4th Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics

Large language models (LLMs) have the potential to be powerful instruments for psychotherapy. However, there is a shortage of practical tools to support their use in production. We develop a novel, iterative process of updating conversational context to track the Emotional Intelligence Scale (EIS) in real time, and test it on Llama-70b. Through this, we show that (1) EIS varies more in psychotherapeutic (emotional support) conversations than in control (emotionally unstimulating) conversations and (2) model responses can be systematically classified to identify consistent patterns. Thus, EIS is a valid indicator of empathetic model behavior: rises in the EIS score correspond to prosocial behavior, and falls correspond to detached, unsocial behavior. These results suggest that psychometric questionnaires like the EIS can provide a structured lens for observing the empathetic stability of models and offer a foundation for future work on their role in psychotherapy.

Linear Relational Decoding of Morphology in Language Models
Eric Xia | Jugal Kalita
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop)

A two-part affine map has been found to closely approximate transformer computations over certain subject-object relations. Adapting the Bigger Analogy Test Set, we show that the linear transformation Ws, where s is a middle-layer representation of a subject token and W is derived from model derivatives, can accurately reproduce final object states for many relations. This linear technique achieves 90% faithfulness on morphological relations, with similar findings across languages and models. Our results suggest that some conceptual relationships in language models, such as morphology, are readily interpretable from latent space and are sparsely encoded by cross-layer linear transformations.

Beyond the Haystack: Sensitivity to Context in Legal Reference Recall
Eric Xia | Karthik Srikumar | Keshav Karthik | Advaith Renjith | Ashwinee Panda
Proceedings of the Natural Legal Language Processing Workshop 2025

Reference retrieval is critical for many applications in the legal domain, for instance determining which case texts support a particular claim. However, existing benchmarking methods do not rigorously evaluate recall in previously unseen contexts. We develop an evaluation framework from U.S. court opinions that ensures models have no prior knowledge of case results or context. Applying our framework, we identify a consistent gap across models and tasks between traditional needle-in-a-haystack retrieval and actual performance on legal recall. Our work shows that standard needle-in-a-haystack benchmarks consistently overestimate recall performance in the legal domain. By isolating the cause of performance degradation to contextual informativity rather than distributional differences, our findings highlight the need for specialized testing in reference-critical applications and establish an evaluation framework for improving retrieval across informativity levels.