Bingxin Qu


Fixing paper assignments

  1. Please select all papers that belong to the same person.
  2. Indicate below which author they should be assigned to.
Provide a valid ORCID iD here. This will be used to match future papers to this author.
Provide the name of the school or the university where the author has received or will receive their highest degree (e.g., Ph.D. institution for researchers, or current affiliation for students). This will be used to form the new author page ID, if needed.

TODO: "submit" and "cancel" buttons here


2025

pdf bib
Shy-hunyuan-MT at WMT25 General Machine Translation Shared Task
Mao Zheng | Zheng Li | Yang Du | Bingxin Qu | Mingyang Song
Proceedings of the Tenth Conference on Machine Translation

In this paper, we present our submission to the WMT25 shared task on machine translation, for which we propose Synergy-enhanced policy optimization framework, named Shy. This novel two-phase training framework synergistically combines knowledge distillation and fusion via reinforcement learning.In the first phase, we introduce a multi-stage training framework that harnesses the complementary strengths of multiple state-of-the-art large language models to generate diverse, high-quality translation candidates. These candidates serve as pseudo-references to guide the supervised fine-tuning of our model, Hunyuan-7B, effectively distilling the collective knowledge of multiple expert systems into a single efficient model.In the second phase, we further refine the distilled model through Group Relative Policy Optimization, a reinforcement learning technique that employs a composite reward function. By calculating reward from multiple perspectives, our model ensures better alignment with human preferences and evaluation metrics.Extensive experiments across multiple language pairs demonstrate that our model Shy-hunyuan-MT yields substantial improvements in translation quality compared to baseline approaches. Notably, our framework achieves competitive performance comparable to that of state-of-the-art systems while maintaining computational efficiency through knowledge distillation and strategic ensemble.