Ying Qian
2026
APEX: Learning Adaptive Priorities for Multi-Objective Alignment in Vision-Language Generation
Dongliang Chen | Xinlin Zhuang | Junjie Xu | Luojian Xie | Zehui Wang | Jiaxi Zhuang | Haolin Yang | Liang Dou | Xiao He | Xingjiao Wu | Ying Qian
Findings of the Association for Computational Linguistics: ACL 2026
Multi-objective alignment for text-to-image generation is commonly implemented via static linear scalarization, but fixed weights often fail under heterogeneous rewards, leading to optimization imbalance where models overfit high-variance, high-responsiveness objectives (e.g., OCR) while under-optimizing perceptual goals. We identify two mechanistic causes: variance hijacking, where reward dispersion induces implicit reweighting that dominates the normalized training signal, and gradient conflicts, where competing objectives produce opposing update directions and trigger seesaw-like oscillations. We propose APEX (Adaptive Priority-based Efficient X-objective Alignment), which stabilizes heterogeneous rewards with Dual-Stage Adaptive Normalization and dynamically schedules objectives via 𝒫³ Adaptive Priorities that combine learning potential, conflict penalty, and progress need. On Stable Diffusion 3.5, APEX achieves improved Pareto trade-offs across four heterogeneous objectives, with balanced gains of +1.31 PickScore, +0.35 DeQA, and +0.53 Aesthetics while maintaining competitive OCR accuracy, mitigating the instability of multi-objective alignment.
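The variance hijacking effect described in this abstract can be illustrated with a small numeric sketch. It is not APEX's Dual-Stage Adaptive Normalization, whose details the abstract does not give. It only contrasts two generic options: standardizing the sum of raw rewards, and standardizing each reward separately before averaging. The reward values and the names `ocr`, `aesthetic`, `naive`, and `balanced` are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma > 0 else 0.0 for v in values]


# Hypothetical rewards for four generated images (assumed values).
ocr = [0.0, 1.0, 0.0, 1.0]            # high-variance objective
aesthetic = [5.10, 5.00, 5.12, 5.02]  # low-variance objective, prefers images 0 and 2

# Static scalarization: sum the raw rewards, then normalize the total.
# The high-variance objective decides the sign of every training signal.
naive = standardize([o + a for o, a in zip(ocr, aesthetic)])

# Per-objective standardization first, then an equal-weight average.
# Both objectives now contribute on a comparable scale.
balanced = [
    (zo + za) / 2
    for zo, za in zip(standardize(ocr), standardize(aesthetic))
]

for i, (n, b) in enumerate(zip(naive, balanced)):
    print(f"image {i}: naive={n:+.3f} balanced={b:+.3f}")
```

In the naive signal every sign follows the OCR reward, while in the balanced signal images 1 and 2 change sign because the aesthetic preference is no longer drowned out.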
2025
Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models
Xinlin Zhuang | Jiahui Peng | Ren Ma | Yinfan Wang | Tianyi Bai | Xingjian Wei | Qiu Jiantao | Chi Zhang | Ying Qian | Conghui He
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The composition of pre-training datasets for large language models (LLMs) remains largely undisclosed, hindering transparency and efforts to optimize data quality—a critical driver of model performance. Current data selection methods, such as natural language quality assessments, diversity-based filters, and classifier-based approaches, are limited by single-dimensional evaluation or redundancy-focused strategies. To address these gaps, we propose four dimensions to evaluate data quality: professionalism, readability, reasoning, and cleanliness. We further introduce Meta-rater, a multi-dimensional data selection method that integrates these dimensions with existing quality metrics through learned optimal weightings. Meta-rater employs proxy models to train a regression model that predicts validation loss, enabling the identification of optimal combinations of quality scores. Experiments demonstrate that Meta-rater doubles convergence speed for 1.3B parameter models and improves downstream task performance by 3.23%, with advantages that scale to models as large as 7.2B parameters. Our work establishes that holistic, multi-dimensional quality integration significantly outperforms conventional single-dimension approaches, offering a scalable paradigm for enhancing pre-training efficiency and model capability. To advance future research, we release scripts, data, and models at https://github.com/opendatalab/Meta-rater.
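The weight-search idea in this abstract (fit a regression from quality-score weightings to proxy-model validation loss, then pick the weighting with the lowest predicted loss) can be sketched as follows. This is a toy under stated assumptions, not the released Meta-rater code: the quadratic least-squares regressor, the three-score setup, the grid search, and the synthetic `proxy_validation_loss` function are all hypothetical stand-ins, since the abstract does not specify the regression model or the search procedure.

```python
import numpy as np


def features(w):
    """Quadratic features [1, w, w^2] for a matrix with one weighting per row."""
    w = np.atleast_2d(w)
    return np.hstack([np.ones((w.shape[0], 1)), w, w ** 2])


def proxy_validation_loss(w, target=(0.5, 0.3, 0.2)):
    """Hypothetical stand-in for training a proxy model and measuring its loss.

    The loss is lowest when the weighting equals `target` (an assumed value).
    """
    return 2.0 + np.sum((np.atleast_2d(w) - np.asarray(target)) ** 2, axis=1)


# 1. Sample random weightings over three quality scores and record the
#    (synthetic) validation loss that each proxy run would produce.
rng = np.random.default_rng(0)
train_w = rng.dirichlet(np.ones(3), size=40)
train_loss = proxy_validation_loss(train_w)

# 2. Fit a regression model that predicts validation loss from the weighting.
coef, *_ = np.linalg.lstsq(features(train_w), train_loss, rcond=None)

# 3. Score every weighting on a coarse simplex grid and keep the best one.
steps = 10
grid = np.array(
    [(i, j, steps - i - j) for i in range(steps + 1) for j in range(steps + 1 - i)]
) / steps
predicted = features(grid) @ coef
best_weights = grid[np.argmin(predicted)]

print("best weighting:", best_weights)
```

The appeal of this design is that the expensive step, training proxy models, happens only for the sampled weightings, while the fitted regressor can score any number of candidate weightings cheaply.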