Sirry Chen

2026

SpeechMedAssist: Efficiently and Effectively Adapting Speech Language Models for Medical Consultation
Sirry Chen | Jieyi Wang | Wei Chen | Zhongyu Wei
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Medical consultations are intrinsically speech-centric. However, most prior work focuses on long-text-based interactions, which are cumbersome and patient-unfriendly. Recent advances in speech language models (SpeechLMs) have enabled more natural speech-based interaction, yet the scarcity of medical speech data and the inefficiency of directly fine-tuning on speech data jointly hinder the adoption of SpeechLMs in medical consultation. In this paper, we propose SpeechMedAssist, a SpeechLM natively capable of conducting speech-based multi-turn interactions with patients. By exploiting the architectural properties of SpeechLMs, we decouple the conventional one-stage training into a two-stage paradigm consisting of **(1) Knowledge Capability Injection via Text** and **(2) Modality Re-alignment with Limited Speech Data**, thereby reducing the requirement for medical speech data to only **10k** synthesized samples. To evaluate SpeechLMs for medical consultation scenarios, we design a benchmark comprising both single-turn question answering and multi-turn simulated interactions. Experimental results show that our model outperforms all baselines in both effectiveness and robustness in most evaluation settings.

2024


Social media bot detection is increasingly crucial with the rise of social media platforms. Existing methods predominantly model social networks as graphs and use graph neural networks (GNNs) for bot detection. However, most of these methods focus on improving GNN performance while neglecting the community structure within social networks. Moreover, GNN-based methods still face problems such as poor generalization due to the relatively small scale of the dataset and over-smoothing caused by the information propagation mechanism. To address these problems, we propose the Community-Aware Heterogeneous Graph Contrastive Learning framework (CACL), which models the social network as a heterogeneous graph with multiple node types and edge types, and then uses a community-aware module to mine both hard positive and hard negative samples for supervised graph contrastive learning with adaptive graph enhancement algorithms. Extensive experiments demonstrate that our framework addresses these challenges and outperforms competitive baselines on three social media bot benchmarks.

Co-authors

Jieyi Wang 1

Zhongyu Wei (魏忠钰) 1

Chen-Chen Zong 1

Venues

ACL 1
Findings 1
