Guoming Li
2026
GlossaGen: Making Academic Translation Smarter with Glossing
Zixiao Wang | Duzhen Zhang | Juntian Zhang | Yuhan Liu | Guoming Li | Haolun Wu | Le Song | Xiuying Chen
Findings of the Association for Computational Linguistics: ACL 2026
Non-native readers often face significant challenges when reading foreign-language literature. Traditional machine translation systems tend to obscure or mistranslate key terminology, while paraphrasing aimed at lay readers often oversimplifies it, which hinders readers' ability to master domain-specific technical vocabulary. To bridge this gap, we first define a novel task, Glossing-Oriented Academic Translation (GOAT), which aims to produce translations dynamically adapted to a reader's academic proficiency level. We then propose GlossaGen, a comprehensive framework to address this task. GlossaGen features two key innovations: a multi-agent data synthesis pipeline that leverages academic personas to automatically generate a large-scale, structured dataset with level-specific explanations; and a novel training strategy based on dynamic adapter merging, which balances task generalization with user-level specialization by combining a "generalist" adapter with a fine-grained "expert" one. We evaluate GlossaGen on our synthesized benchmark, where automatic metrics, large language model (LLM)-based assessments, and human evaluations consistently show that our approach scores higher than strong baselines on most metrics. Our framework provides a scalable pathway to making scientific literature more comprehensible for non-native readers, delivering more accurate translations accompanied by pedagogically sound, level-specific term explanations. We release our code and data to facilitate further research.
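The abstract does not specify how the "generalist" and "expert" adapters are combined, so the following is only a minimal sketch of one common way to merge two adapters: linear interpolation of their weight deltas. The function name `merge_adapters`, the dict-of-lists adapter layout, and the mixing weight `alpha` are all hypothetical and are not taken from the paper or its released code.

```python
from typing import Dict, List

# Hypothetical layout: parameter name -> flattened weight delta.
Adapter = Dict[str, List[float]]


def merge_adapters(generalist: Adapter, expert: Adapter, alpha: float) -> Adapter:
    """Linearly interpolate two adapters; alpha is the weight on the expert.

    This interpolation rule is an assumption for illustration, not the
    merging rule used by GlossaGen.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if generalist.keys() != expert.keys():
        raise ValueError("adapters must cover the same parameters")
    merged: Adapter = {}
    for name, g_vals in generalist.items():
        e_vals = expert[name]
        if len(g_vals) != len(e_vals):
            raise ValueError(f"shape mismatch for {name}")
        merged[name] = [(1.0 - alpha) * g + alpha * e for g, e in zip(g_vals, e_vals)]
    return merged


# Toy example: one parameter with two entries.
generalist = {"layer0.delta": [0.0, 2.0]}
expert = {"layer0.delta": [4.0, 2.0]}
merged = merge_adapters(generalist, expert, alpha=0.25)
print(merged)
```

In this sketch, `alpha = 0` recovers the generalist adapter and `alpha = 1` recovers the expert. A "dynamic" scheme could, for instance, choose `alpha` per reader level, though how GlossaGen actually sets the mixture is not stated in the abstract.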