Abudurexiti Reheman
2026
Synergizing Semantic Anchors and Ordinal Smoothed Cross-Entropy for Speech Fluency Classification
Mulati Kahaer | Sirajahmat Ruzmamat | XuDong Pang | Subinuer Maimaitituerxun | Zaokere Kadeer | Abudurexiti Reheman | Wenwen Lu | Panpan Zheng | Aishan Wumaier
Findings of the Association for Computational Linguistics: ACL 2026
Speech fluency is a core indicator of second language proficiency and a critical component of Computer-Assisted Pronunciation Training (CAPT) systems. Accurate assessment requires models to perceive both macroscopic speech flow trends and microscopic local anomalies. However, existing methods struggle to bridge the semantic gap between static expert priors and dynamic temporal representations, while often overlooking the inherent ordinal nature of fluency scores. To address these challenges, we first construct a set of expert features targeting fluency disruptions and rhythmic regularity to provide explicit linguistic priors. Building on this, we propose the Multimodal Multi-Stream Fusion Classification (MMSFC) network. It employs a Mutual Cross-Attention (MCA) mechanism that leverages these expert features as “semantic anchors” to actively guide Whisper’s temporal representations and integrate decoder contexts, achieving deep interaction between global priors and local dynamics. Furthermore, we propose the Ordinal Smoothed Cross-Entropy (OSCE) loss. By constructing distance-aware soft target distributions coupled with confidence-adaptive smoothing and boundary enhancement, OSCE explicitly models ordinal relationships to resolve boundary ambiguity. Experiments on SpeechOcean762 show MMSFC achieves 83.40% accuracy, significantly outperforming strong baselines. Notably, OSCE also demonstrates superior generalization potential in cross-domain CV and NLP tasks. Our code is available at https://github.com/speech26ai/MMSFCCode.
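The idea of distance-aware soft targets mentioned in the abstract can be illustrated with a minimal sketch. This is not the OSCE loss itself: the abstract does not give its formula, and the confidence-adaptive smoothing and boundary enhancement components are omitted here. The exponential decay over class distance, the `temperature` parameter, and both function names are assumptions made for illustration only.

```python
import math


def ordinal_soft_targets(true_class, num_classes, temperature=1.0):
    """Build a soft target distribution that decays with ordinal distance.

    Assumption: weight for class k is exp(-|k - true_class| / temperature),
    normalised to sum to 1. The paper's actual construction may differ.
    """
    weights = [
        math.exp(-abs(k - true_class) / temperature) for k in range(num_classes)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def soft_cross_entropy(logits, targets):
    """Cross-entropy between a soft target distribution and softmax(logits)."""
    # Numerically stable log-sum-exp for the softmax normaliser.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(t * (x - log_z) for t, x in zip(targets, logits))


# Hypothetical 5-level fluency scale with true level 2.
targets = ordinal_soft_targets(true_class=2, num_classes=5)
near_miss = soft_cross_entropy([0.0, 0.0, 0.0, 5.0, 0.0], targets)  # predicts level 3
far_miss = soft_cross_entropy([5.0, 0.0, 0.0, 0.0, 0.0], targets)   # predicts level 0
```

Under these assumptions, a prediction one level away from the true score incurs a smaller loss than a prediction two levels away, which is the ordinal behaviour that a plain one-hot cross-entropy does not provide.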
2025
Enhancing Neural Machine Translation Through Target Language Data: A kNN-LM Approach for Domain Adaptation
Abudurexiti Reheman | Hongyu Liu | Junhao Ruan | Abudukeyumu Abudula | Yingfeng Luo | Tong Xiao | JingBo Zhu
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Neural machine translation (NMT) has advanced significantly, yet challenges remain in adapting to new domains. In scenarios where bilingual data is limited, this issue is further exacerbated. To address this, we propose kNN-LM-NMT, a method that leverages semantically similar target language sentences in the kNN framework. Our approach generates a probability distribution over these sentences during decoding, and this distribution is then interpolated with the NMT model’s distribution. Additionally, we introduce an n-gram-based approach to focus on similar fragments, enabling the model to avoid the noise introduced by the non-similar parts. To enhance accuracy, we further incorporate cross-lingual retrieval similarity to refine the kNN probability distribution. Extensive experiments on multi-domain datasets demonstrate significant performance improvements in both high-resource and low-resource scenarios. Our approach effectively extracts translation knowledge from limited target domain data, and also benefits from large-scale monolingual data for robust context representation.
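The interpolation step described in the abstract can be sketched as a linear mixture of two next-token distributions. This is a minimal illustration, assuming the standard kNN-LM style mixture with a fixed weight; the abstract does not state the exact form, and the function name, the weight `lam`, and the toy three-token vocabulary are hypothetical.

```python
def interpolate(p_nmt, p_knn, lam=0.5):
    """Mix an NMT distribution with a kNN-derived distribution.

    Assumption: p = lam * p_knn + (1 - lam) * p_nmt, with both inputs
    defined over the same vocabulary in the same order.
    """
    if len(p_nmt) != len(p_knn):
        raise ValueError("distributions must cover the same vocabulary")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * k + (1.0 - lam) * n for n, k in zip(p_nmt, p_knn)]


# Toy example: the NMT model prefers token 0, the retrieved neighbours prefer token 1.
p_nmt = [0.7, 0.2, 0.1]
p_knn = [0.1, 0.8, 0.1]
p_mix = interpolate(p_nmt, p_knn, lam=0.5)
```

Because both inputs sum to 1 and the weights sum to 1, the mixture is again a valid distribution. Setting `lam` to 0 recovers the NMT model alone.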
2024
Exploiting Target Language Data for Neural Machine Translation Beyond Back Translation
Abudurexiti Reheman | Yingfeng Luo | Junhao Ruan | Chunliang Zhang | Anxiang Ma | Tong Xiao | JingBo Zhu
Findings of the Association for Computational Linguistics: ACL 2024
Neural Machine Translation (NMT) encounters challenges when translating in new domains and low-resource languages. To address these issues, researchers have proposed methods to integrate additional knowledge into NMT, such as translation memories (TMs). However, finding TMs that closely match the input sentence remains challenging, particularly in specific domains. On the other hand, monolingual data is widely accessible in most languages, and back-translation is seen as a promising approach for utilizing target language data. Nevertheless, it still necessitates additional training. In this paper, we introduce Pseudo-kNN-MT, a variant of k-nearest neighbor machine translation (kNN-MT) that utilizes target language data by constructing a pseudo datastore. Furthermore, we investigate the utility of large language models (LLMs) for the kNN component. Experimental results demonstrate that our approach exhibits strong domain adaptation capability in both high-resource and low-resource machine translation. Notably, LLMs are found to be beneficial for robust NMT systems.
2023
Augmenting Large Language Model Translators via Translation Memories
Yongyu Mu | Abudurexiti Reheman | Zhiquan Cao | Yuchun Fan | Bei Li | Yinqiao Li | Tong Xiao | Chunliang Zhang | Jingbo Zhu
Findings of the Association for Computational Linguistics: ACL 2023
Using translation memories (TMs) as prompts is a promising approach to in-context learning of machine translation models. In this work, we take a step towards prompting large language models (LLMs) with TMs and making them better translators. We find that the ability of LLMs to “understand” prompts is indeed helpful for making better use of TMs. Experiments show that the results of a pre-trained LLM translator can be greatly improved by using high-quality TM-based prompts. These results are even comparable to those of the state-of-the-art NMT systems which have access to large-scale in-domain bilingual data and are well tuned on the downstream tasks.
2020
The NiuTrans Machine Translation Systems for WMT20
Yuhao Zhang | Ziyang Wang | Runzhe Cao | Binghao Wei | Weiqiao Shan | Shuhan Zhou | Abudurexiti Reheman | Tao Zhou | Xin Zeng | Laohu Wang | Yongyu Mu | Jingnan Zhang | Xiaoqian Liu | Xuanjun Zhou | Yinqiao Li | Bei Li | Tong Xiao | Jingbo Zhu
Proceedings of the Fifth Conference on Machine Translation
This paper describes the NiuTrans neural machine translation systems for the WMT20 news translation tasks. We participated in five tasks in total: Japanese<->English, English->Chinese, Inuktitut->English and Tamil->English, and ranked first in both directions of Japanese<->English. We mainly utilized iterative back-translation, model architectures of different depths and widths, iterative knowledge distillation and iterative fine-tuning. We find that adequately widening and deepening the model simultaneously improves performance significantly. The iterative fine-tuning strategy we implemented is also effective for domain adaptation. For the Inuktitut->English and Tamil->English tasks, we built multilingual models separately and employed pretrained word embeddings to obtain better performance.
Co-authors
- Tong Xiao (肖桐) 4
- JingBo Zhu (朱靖波) 4
- Bei Li 2
- Yinqiao Li 2
- Yingfeng Luo 2
- Yongyu Mu 2
- Junhao Ruan 2
- Chunliang Zhang 2
- Abudukeyumu Abudula 1
- Runzhe Cao 1
- Zhiquan Cao 1
- Yuchun Fan 1
- Zaokere Kadeer 1
- Mulati Kahaer 1
- Hongyu Liu 1
- Xiaoqian Liu 1
- Wenwen Lu 1
- Anxiang Ma 1
- Subinuer Maimaitituerxun 1
- XuDong Pang 1
- Sirajahmat Ruzmamat 1
- Weiqiao Shan 1
- Laohu Wang 1
- Ziyang Wang 1
- Binghao Wei 1
- Aishan Wumaier 1
- Xin Zeng 1
- Jingnan Zhang 1
- Yuhao Zhang 1
- Panpan Zheng 1
- Shuhan Zhou 1
- Tao Zhou 1
- Xuanjun Zhou 1