Jianming Xu
2025
AgentCPM-GUI: Building Mobile-Use Agents with Reinforcement Fine-Tuning
Zhong Zhang | Yaxi Lu | Yikun Fu | Yupeng Huo | Shenzhi Yang | Yesai Wu | Han Si | Xin Cong | Haotian Chen | Yankai Lin | Xie Xie | Wei Zhou | Wang Xu | Zhou Su | Zhongwu Zhai | Xiaoming Liu | Meiyudong | Jianming Xu | Hongyan Tian | Chongyi Wang | Chi Chen | Yuan Yao | Zhiyuan Liu | Maosong Sun
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
Large language model agents have enabled GUI-based automation, particularly for mobile devices. However, deployment remains limited by noisy data, poor generalization, and a lack of support for non-English GUIs. In this work, we present AgentCPM-GUI, an 8B-parameter GUI agent built for robust and efficient on-device GUI interaction. Our training pipeline includes grounding-aware pre-training to enhance perception, supervised fine-tuning on high-quality Chinese and English trajectories to imitate human-like actions, and reinforcement fine-tuning with Group Relative Policy Optimization (GRPO) to improve reasoning capability. AgentCPM-GUI achieves promising performance on five public benchmarks and our proposed Chinese benchmark CAGUI. To facilitate reproducibility and further research, we publicly release all code, the model checkpoint, and evaluation data at: https://github.com/OpenBMB/AgentCPM-GUI
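The abstract names GRPO as the reinforcement fine-tuning method but gives no details of how it is applied. As a rough illustration only, the sketch below shows the group-relative advantage normalization that GRPO is generally known for: several responses are sampled for the same prompt, and each reward is standardized against the group's mean and standard deviation. The function name, the reward values, and the use of the population standard deviation are assumptions made for this example; they are not taken from the AgentCPM-GUI code or paper.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of sampled responses.

    Hypothetical helper: each advantage is (reward - group mean) divided by
    (group standard deviation + eps). The population standard deviation is
    an assumption; implementations may use the sample version instead.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four actions sampled for the same screenshot and
# instruction (1.0 = judged correct, 0.0 = judged incorrect).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Responses that score above the group average receive positive advantages and the rest receive negative ones, so no separate value model is needed to form a baseline. When every response in a group earns the same reward, all advantages are zero and that group contributes no learning signal.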