Byung-Hoon Kim
2026
Anonpsy: A Graph-Based Framework for Structure-Preserving De-identification of Psychiatric Narratives
Kyungho Lim | Byung-Hoon Kim
Findings of the Association for Computational Linguistics: ACL 2026
Psychiatric narratives encode patient identity not only through explicit identifiers but also through idiosyncratic life events embedded in clinical structure. Existing de-identification approaches, including PHI masking and LLM-based synthetic rewriting, operate at the text level and offer limited control over which semantic elements are preserved or altered. We introduce Anonpsy, a de-identification framework that reformulates the task as graph-guided semantic rewriting. Anonpsy (1) converts each narrative into a semantic graph encoding clinical entities, temporal anchors, and typed relations; (2) applies graph-constrained perturbations that modify identifying context while preserving clinical structure; and (3) regenerates text via graph-conditioned LLM generation. Evaluated on 90 clinician-authored psychiatric case narratives, Anonpsy preserves diagnostic fidelity while achieving consistently low re-identification risk under expert, semantic, and GPT-5-based evaluations. Compared with a strong LLM-only rewriting baseline, Anonpsy yields substantially lower semantic similarity and identifiability. These results demonstrate that explicit structural representations combined with constrained generation provide an effective approach to de-identification for psychiatric narratives.
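The three steps in the abstract above (graph construction, graph-constrained perturbation, graph-conditioned regeneration) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Node` and `Edge` types, the `IDENTIFYING_KINDS` policy, the substitution table, and the toy narrative are all hypothetical, and the final step only serializes the graph into text that could be handed to an LLM rather than calling one.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Node:
    node_id: str
    kind: str  # hypothetical types, e.g. "life_event", "symptom", "diagnosis"
    text: str


@dataclass(frozen=True)
class Edge:
    source: str
    relation: str  # hypothetical typed relation, e.g. "precedes", "supports"
    target: str


# Assumed policy: node kinds treated as identifying context.
IDENTIFYING_KINDS = {"life_event", "location", "occupation"}


def perturb_graph(nodes, edges, substitutions):
    """Rewrite the text of identifying nodes; keep clinical nodes and all edges."""
    new_nodes = []
    for node in nodes:
        if node.kind in IDENTIFYING_KINDS and node.text in substitutions:
            new_nodes.append(replace(node, text=substitutions[node.text]))
        else:
            new_nodes.append(node)
    return new_nodes, list(edges)


def graph_to_prompt(nodes, edges):
    """Serialize the graph as one line per relation, for use as LLM conditioning text."""
    by_id = {node.node_id: node for node in nodes}
    return "\n".join(
        f"{by_id[e.source].text} --{e.relation}--> {by_id[e.target].text}"
        for e in edges
    )


# Toy graph, invented for this sketch.
nodes = [
    Node("n1", "life_event", "lost job as a ferry pilot"),
    Node("n2", "symptom", "insomnia"),
    Node("n3", "diagnosis", "major depressive disorder"),
]
edges = [Edge("n1", "precedes", "n2"), Edge("n2", "supports", "n3")]
substitutions = {"lost job as a ferry pilot": "was laid off from a logistics job"}

new_nodes, new_edges = perturb_graph(nodes, edges, substitutions)
prompt = graph_to_prompt(new_nodes, new_edges)
print(prompt)
```

In this sketch only the life-event node changes, while the symptom, the diagnosis, and both relations are carried over untouched, which is the sense in which the perturbation is structure-preserving.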
2024
ERD: A Framework for Improving LLM Reasoning for Cognitive Distortion Classification
Sehee Lim | Yejin Kim | Chi-Hyun Choi | Jy-yong Sohn | Byung-Hoon Kim
Proceedings of the 6th Clinical Natural Language Processing Workshop
Improving the accessibility of psychotherapy with the aid of Large Language Models (LLMs) has garnered significant attention in recent years. Recognizing cognitive distortions from the interviewee’s utterances can be an essential part of psychotherapy, especially for cognitive behavioral therapy. In this paper, we propose ERD, which improves LLM-based cognitive distortion classification performance with the aid of two additional modules: (1) extracting the parts related to cognitive distortion, and (2) debating the reasoning steps by multiple agents. Our experimental results on a public dataset show that ERD improves the multi-class F1 score as well as the binary specificity score. Regarding the latter, our method is effective in debiasing the baseline method, which has a high false positive rate, especially when the summary of the multi-agent debate is provided to the LLMs.
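The link between binary specificity and the false positive rate mentioned above follows from the standard definitions: specificity is TN / (TN + FP), and the false positive rate is 1 minus specificity. The sketch below computes both from labels. The label vectors are toy values invented for illustration, not results from the paper, and the names `baseline_pred` and `debiased_pred` are hypothetical.

```python
def binary_specificity(y_true, y_pred):
    """Return TN / (TN + FP), with 1 = distortion present and 0 = absent."""
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tn + fp == 0:
        raise ValueError("specificity is undefined without negative examples")
    return tn / (tn + fp)


def false_positive_rate(y_true, y_pred):
    """Return FP / (TN + FP), which equals 1 - specificity."""
    return 1.0 - binary_specificity(y_true, y_pred)


# Toy labels (assumed): four utterances without a distortion, two with one.
y_true = [0, 0, 0, 0, 1, 1]
baseline_pred = [1, 1, 1, 0, 1, 1]  # over-predicts distortions
debiased_pred = [0, 0, 1, 0, 1, 1]  # fewer false positives

baseline_spec = binary_specificity(y_true, baseline_pred)
debiased_spec = binary_specificity(y_true, debiased_pred)
print(baseline_spec, false_positive_rate(y_true, baseline_pred))
print(debiased_spec, false_positive_rate(y_true, debiased_pred))
```

A classifier that flags most utterances as distorted scores low on specificity even when its recall is high, which is why a specificity gain indicates reduced over-prediction.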