Guanjun Wang
2026
ACRM: Multi-Agent Trajectory Learning for Automated Credit Risk Model Refreshing in Production
Liangzu Liu | Mengzhe Ruan | Xiaotian Chen | Haonan Chen | Xudong Niu | Wendi Yuan | Yuechen Li | Yang Liu | Guanjun Wang
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026)
Credit risk models suffer from rapid performance decay due to distribution shifts, requiring frequent updates to meet strict operational guardrails. However, manual refreshing takes weeks of trial and error across upstream data engineering and downstream training. We present ACRM, a deployed multi-agent framework that automates the end-to-end credit modeling workflow by treating it as a learnable trajectory of agent interactions. Unlike AutoML, which optimizes hyperparameters on fixed datasets, ACRM extends its action space to upstream data semantics (cohort selection, observation windowing, and feature screening), where the majority of performance recovery occurs. A central Orchestrator coordinates specialist agents through a three-stream decision stack: rule-based safety guardrails, retrieval-augmented grounding from historical workflows, and preference alignment via Direct Preference Optimization (DPO) on expert-labeled trajectories. Deployed at a major fintech institution for three months across six business scenarios, ACRM reduced the average model refresh cycle from weeks to 1.1 days and cut iteration rounds by 65%, while maintaining superior stability metrics.