CEMT: Controllable Element-Oriented Machine Translation via Structured Linguistic Reasoning
Lingling Shi, Haoyu Jin, Ruiyu Fang, Shuangyong Song, Jinsong Su, Yongxiang Li, Xuelong Li
Abstract
Large Language Models have shown strong performance in Machine Translation, yet they often suffer from paraphrasing errors, omissions, or hallucinations when the input contains translation-specific elements (e.g., URLs, slang, and idioms) that require strict preservation or controlled transformation, undermining the reliability of critical details. We propose CEMT, a Controllable Element-Oriented Machine Translation framework inspired by the analysis–strategy–generation paradigm in human translation. CEMT first employs an Element Detection Module to identify translation-specific elements, and then introduces a Translation Module that decomposes the translation process into linguistically grounded analysis, strategy formulation, and final generation, thereby guiding the reliable translation of these elements. We further introduce a CoT Judge model during training that provides step-wise supervision over the accuracy and consistency of the translation process. On the WMT23/24 Chinese–English benchmarks, CEMT improves performance over existing Machine Translation models while significantly reducing element-level constraint violations.
- Anthology ID:
- 2026.findings-acl.1882
- Volume:
- Findings of the Association for Computational Linguistics: ACL 2026
- Month:
- July
- Year:
- 2026
- Address:
- San Diego, California, United States
- Editors:
- Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 37755–37781
- URL:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1882/
- Cite (ACL):
- Lingling Shi, Haoyu Jin, Ruiyu Fang, Shuangyong Song, Jinsong Su, Yongxiang Li, and Xuelong Li. 2026. CEMT: Controllable Element-Oriented Machine Translation via Structured Linguistic Reasoning. In Findings of the Association for Computational Linguistics: ACL 2026, pages 37755–37781, San Diego, California, United States. Association for Computational Linguistics.
- Cite (Informal):
- CEMT: Controllable Element-Oriented Machine Translation via Structured Linguistic Reasoning (Shi et al., Findings 2026)
- PDF:
- https://preview.aclanthology.org/ingest-acl/2026.findings-acl.1882.pdf