Lujia Yang
2026
SAME: Signer-Aware Mixture-of-Experts for Test-Time Adaptation in Sign Language Translation
Lujia Yang | Weicai Yan | Yongbo He | Qifei Zhang | Tao Jin | Jinshan Zhang | Meng Xi | Jianwei Yin
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Sign language translation (SLT) is essential for bridging communication between the deaf and hearing communities, but real-world deployment suffers from domain shift such as signer variability, lighting, and background changes. Supervised fine-tuning is impractical due to limited labeled data, and existing unsupervised adaptation methods require batch statistics or long adaptation. We introduce Test-Time Adaptation (TTA) for SLT, enabling rapid adaptation to domain shift without the need for labeled data. To the best of our knowledge, this is the first study to explore TTA in SLT. Existing TTA methods predominantly focus on image classification tasks and lack a comprehensive strategy for handling domain shift in SLT. In response, we introduce SAME, a plug-and-play, signer-aware Mixture-of-Experts (MoE) TTA architecture for SLT. SAME inserts lightweight MoE modules after multiple encoder layers. Gates are conditioned on signer features and stabilized with unsupervised regularizers, effectively decoupling domain shift across encoder depths while enabling personalized adaptation. Experiments show that SAME outperforms existing TTA methods and can enhance the capabilities of multiple SLT models.
2025
PAChat: Persona-Aware Speech Assistant for Multi-party Dialogue
Dongjie Fu | Xize Cheng | Linjun Li | Xiaoda Yang | Lujia Yang | Tao Jin
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Extensive research on LLM-based spoken dialogue systems has significantly advanced the development of intelligent voice assistants. However, the integration of role information within speech remains an underexplored area, limiting its application in real-world scenarios, particularly in multi-party dialogue settings. With the growing demand for personalization, voice assistants that can recognize and remember users establish a deeper connection with them. We focus on equipping LLMs with speaker-awareness capabilities and enhancing their understanding of character settings through synthetic data to generate contextually appropriate responses. We introduce Persona-Dialogue, the first large-scale multi-party spoken dialogue dataset that incorporates speaker profiles. Based on this dataset, we propose PAChat, an architecture that simultaneously models both linguistic content and speaker features, allowing LLMs to map character settings to speaker identities in speech. Through extensive experiments, we demonstrate that PAChat successfully achieves speaker-specific responses, character understanding, and the generation of targeted replies in multi-party dialogue scenarios, surpassing existing spoken dialogue systems.