Zhangkai Zheng

2026

While large language models show promise in mental healthcare, evaluating their therapeutic competence remains challenging due to the unstructured and longitudinal nature of counseling. We argue that current evaluation paradigms suffer from an unanchored defect, leading to two forms of instability: process drift, where unsteered client simulation wanders away from specific counseling goals, and standard drift, where static pointwise scoring lacks the stability for reliable judgment. To address this, we introduce Ps, a unified framework that calibrates the therapeutic competence of LLMs via trajectory-anchored tournaments. We first anchor the interaction trajectory in simulation, where clients precisely control the fluid consultation process to probe multifaceted capabilities. We then anchor the battle trajectory in judgments through an efficient Swiss-system tournament, utilizing dynamic pairwise battles to yield robust Elo ratings. Beyond ranking, we demonstrate that tournament trajectories can be transformed into credible reward signals, enabling on-policy reinforcement learning to enhance LLMs’ performance. Extensive experiments validate the effectiveness of PsychePass and its strong consistency with human expert judgments.

2025

pdf bib abs

MSG-LLM: A Multi-scale Interactive Framework for Graph-enhanced Large Language Models
Jiayu Ding | Zhangkai Zheng | Benshuo Lin | Yun Xue | Yiping Song
Proceedings of the 31st International Conference on Computational Linguistics

Graph-enhanced large language models (LLMs) leverage LLMs’ remarkable ability to model language and use graph structures to capture topological relationships. Existing graph-enhanced LLMs typically retrieve similar subgraphs to augment LLMs, where the subgraphs carry the entities related to our target and relations among the entities. However, the retrieving methods mainly focus solely on accurately matching subgraphs between our target subgraph and the candidate subgraphs at the same scale, neglecting that the subgraphs with different scales may also share similar semantics or structures. To tackle this challenge, we introduce a graph-enhanced LLM with multi-scale retrieval (MSG-LLM). It captures similar graph structures and semantics across graphs at different scales and bridges the graph alignment across multiple scales. The larger scales maintain the graph’s global information, while the smaller scales preserve the details of fine-grained sub-structures. Specifically, we construct a multi-scale variation to dynamically shrink the scale of graphs. Further, we employ a graph kernel search to discover subgraphs from the entire graph, which essentially achieves multi-scale graph retrieval in Hilbert space. Additionally, we propose to conduct multi-scale interactions (message passing) over graphs at various scales to integrate key information. The interaction also bridges the graph and LLMs, helping with graph retrieval and LLM generation. Finally, we employ a Chain-of-Thought-based LLM prediction to perform the downstream tasks. We evaluate our approach on two graph-based downstream tasks and the experimental results show that our method achieves state-of-the-art performance.

Co-authors

Venues

COLING1
Findings1

Fix author