Yongbo He
2026
SAME: Signer-Aware Mixture-of-Experts for Test-Time Adaptation in Sign Language Translation
Lujia Yang | Weicai Yan | Yongbo He | Qifei Zhang | Tao Jin | Jinshan Zhang | Meng Xi | Jianwei Yin
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Sign language translation (SLT) is essential for bridging communication between the deaf and hearing communities, but real-world deployment suffers from domain shift such as signer variability, lighting, and background changes. Supervised fine-tuning is impractical due to limited labeled data, and existing unsupervised adaptation methods require batch statistics or long adaptation. We introduce Test-Time Adaptation (TTA) for SLT, enabling rapid adaptation to domain shift without the need for labeled data. To the best of our knowledge, this is the first study to explore TTA in SLT. Existing TTA methods predominantly focus on image classification tasks and lack a comprehensive strategy for handling domain shift in SLT. In response, we introduce SAME, a plug-and-play, signer-aware Mixture-of-Experts (MoE) TTA architecture for SLT. SAME inserts lightweight MoE modules after multiple encoder layers. Gates are conditioned on signer features and stabilized with unsupervised regularizers, effectively decoupling domain shift across encoder depths while enabling personalized adaptation. Experiments show that SAME outperforms existing TTA methods and can enhance the capabilities of multiple SLT models.
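The abstract describes the architecture only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It shows a residual Mixture-of-Experts adapter whose gate is computed from a signer feature vector instead of from the hidden state. The class name `SignerGatedMoE`, the linear experts, the dimensions, and the `gate_entropy` helper are all assumptions made for illustration. The abstract mentions unsupervised regularizers without naming them, so gate entropy appears here only as one example of a label-free quantity that could be monitored.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class SignerGatedMoE:
    """Hypothetical residual MoE adapter; NOT the paper's implementation."""

    def __init__(self, hidden_dim, signer_dim, num_experts, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

        # Assumption: each expert is a single linear map on the hidden state.
        self.experts = [init(hidden_dim, hidden_dim) for _ in range(num_experts)]
        # Assumption: the gate is a linear map on the signer feature only.
        self.gate = init(num_experts, signer_dim)

    def gate_weights(self, signer_feat):
        """Mixture weights over experts, conditioned on the signer feature."""
        return softmax(matvec(self.gate, signer_feat))

    def __call__(self, hidden, signer_feat):
        weights = self.gate_weights(signer_feat)
        out = list(hidden)  # residual path keeps the encoder output intact
        for weight, expert in zip(weights, self.experts):
            update = matvec(expert, hidden)
            for i, value in enumerate(update):
                out[i] += weight * value
        return out


def gate_entropy(weights):
    """Entropy of the gate distribution; an assumed label-free signal."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)


layer = SignerGatedMoE(hidden_dim=4, signer_dim=3, num_experts=2)
signer = [0.5, -1.0, 2.0]          # placeholder signer feature
frame = [1.0, 0.0, -1.0, 2.0]      # placeholder encoder hidden state
adapted = layer(frame, signer)
weights = layer.gate_weights(signer)
print(adapted, gate_entropy(weights))
```

In a plug-and-play setting such an adapter would sit after selected encoder layers of a frozen SLT model, with only the adapter parameters updated at test time; how SAME actually trains and regularizes its gates is specified in the paper, not here.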
Generative-to-Discriminative Test-Time Adaptation via Manifold-Aware Diffusion and Bayesian Distillation
Boyun Zhang | Zequn Xie | Fangming Feng | Zihan Zhang | Yongbo He | Chuxin Wang | Sihang Cai | Tao Jin | Qifei Zhang
Findings of the Association for Computational Linguistics: ACL 2026
Multimodal Sentiment Analysis (MSA) models typically suffer significant performance degradation under domain shifts. While Test-Time Adaptation (TTA) aims to mitigate this, existing discriminative approaches often succumb to “confident but wrong” predictions on out-of-distribution samples. Conversely, generative models offer robust calibration but incur prohibitive computational costs. To bridge this gap, we propose GD-Adapt (Generative-Discriminative Adaptation), a novel TTA framework that harmonizes the robustness of generative diffusion models with the efficiency of discriminative regression networks via Bayesian Diffusion Distillation (BDD). Specifically, we introduce Auxiliary Generative Regularization (AGR) during pretraining to enforce manifold-aware feature learning. Extensive experiments across five cross-domain scenarios demonstrate our method’s superiority. For instance, on the challenging MOSI to SIMS shift, GD-Adapt reduces Mean Absolute Error (MAE) from 0.6872 to 0.5673 and boosts binary accuracy by 5.81 percentage points (reaching 57.33%). Notably, in scenarios such as SIMS to MOSI, we achieve an 11.18-point gain over the non-adapted baseline.
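As a reading aid for the MOSI to SIMS figures quoted above, the short sketch below works out what they imply. It assumes that 57.33% is the accuracy after adaptation, so the non-adapted accuracy follows by subtracting the reported gain. The helper `mean_absolute_error` is the standard definition of MAE and is included only to make the metric concrete; it is not taken from the paper's code.

```python
def mean_absolute_error(predictions, targets):
    """Standard MAE: average of absolute differences."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


# Figures quoted in the abstract for the MOSI -> SIMS shift.
mae_before, mae_after = 0.6872, 0.5673
acc_after, acc_gain = 57.33, 5.81  # percent and percentage points

absolute_drop = mae_before - mae_after
relative_drop = absolute_drop / mae_before
acc_before = acc_after - acc_gain  # assumes 57.33% is the adapted accuracy

print(f"MAE drop: {absolute_drop:.4f} ({relative_drop:.1%} relative)")
print(f"Implied non-adapted binary accuracy: {acc_before:.2f}%")
```

Under that reading, the MAE falls by 0.1199 (about 17.4% relative), and the non-adapted binary accuracy is 51.52%.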