Zixiao Wang

2026

When reading foreign-language literature, non-native readers often face significant challenges. Traditional machine translation systems tend to obscure or mistranslate key terminology, while paraphrasing aimed at lay readers often oversimplifies it, hindering readers’ ability to master domain-specific technical vocabulary. To bridge this gap, we first define a novel task, Glossing-Oriented Academic Translation (GOAT), which aims to produce translations dynamically adapted to a reader’s level of academic proficiency. We then propose GlossaGen, a comprehensive framework to address this task. GlossaGen features two key innovations: a multi-agent data synthesis pipeline that leverages academic personas to automatically generate a large-scale, structured dataset with level-specific explanations; and a novel training strategy based on dynamic adapter merging, which balances task generalization with user-level specialization by combining a “generalist” adapter with a fine-grained “expert” one. We evaluate GlossaGen on our synthesized benchmark, where results from automatic metrics, large language model (LLM)-based assessments, and human evaluations consistently show that our approach achieves higher scores than strong baselines across most metrics. Our framework provides a scalable pathway to enhance the comprehensibility of scientific literature for non-native readers, delivering more accurate translations accompanied by pedagogically sound, level-specific term explanations. We release our code and data to facilitate further research.

2025

Previous approaches to persona simulation with large language models (LLMs) have typically relied on learning basic biographical information or on limited role-play dialogue datasets to capture a character’s responses. However, a holistic representation of an individual goes beyond surface-level facts or conversations to deeper thoughts and ways of thinking. In this work, we introduce CharacterBot, a model designed to replicate both the linguistic patterns and the distinctive thought patterns manifested in a character’s textual works. Using Lu Xun, a renowned Chinese writer, as a case study, we propose four training tasks derived from his 17 essay collections. These include a pre-training task focused on mastering external linguistic structures and knowledge, as well as three fine-tuning tasks: multiple-choice question answering, generative question answering, and style transfer, each aligning the LLM with Lu Xun’s internal ideation and writing style. To optimize learning across these tasks, we introduce a CharLoRA parameter updating mechanism, in which a general linguistic style expert collaborates with task-specific experts to better learn both the language style and the deeper thoughts behind it. We evaluate CharacterBot on three tasks for linguistic accuracy and opinion comprehension, demonstrating that it significantly outperforms the baselines on our adapted metrics. We hope this work inspires future research on deep character persona simulation with LLMs: https://github.com/zxwang63/characterbot

2023

ATFormer: A Learned Performance Model with Transfer Learning Across Devices for Deep Learning Tensor Programs
Yang Bai | Wenqian Zhao | Shuo Yin | Zixiao Wang | Bei Yu
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

The training and inference efficiency of ever-larger deep neural networks relies heavily on the performance of tensor operators on specific hardware platforms. Therefore, a compilation-based optimization flow with automatic tensor generation and parameter tuning is necessary for efficient model deployment. While compilation-based methods with performance models can provide dynamic and suitable code optimization, they suffer from a large design space to explore, rough measurement accuracy, and poor transferability across hardware platforms. This paper presents ATFormer, a simple yet efficient design with attention-inspired modules that accurately predicts the performance of optimized operators by capturing global and long-range dependencies within a complete scheduling space. Compared with state-of-the-art methods, ATFormer can predict the optimal implementation of tensor operators to reduce inference time with minimal effort on modern DNN benchmarks. Furthermore, ATFormer with pre-trained parameters can quickly adapt to different workloads and hardware via transfer learning.

Co-authors

Bei Yu 1

Venues

Findings 2
EMNLP 1
