Yifan Wu

2026

Human experts tackle difficult math problems by identifying and executing a few pivotal steps rather than listing every intermediate thought. In contrast, standard Chain-of-Thought (CoT) distillation trains small models on lengthy reasoning traces, encouraging a uniform overthinking style across easy and hard items alike. The result is rigid, slow solutions that sacrifice adaptivity, whereas humans naturally adapt their problem-solving strategy, dedicating significant effort to difficult problems while finding quick, simple solutions for easier ones. We argue that the root cause lies in the training data: it contains excess information and reasoning steps organized in ways misaligned with human practice. We address this with Difficulty-Aware Distillation (DAD), a procedure for producing training data that mirrors concise human reasoning. A large teacher model first assesses a problem’s difficulty and then rewrites the solution to retain only the essential steps. Using this process, we construct LiteCoT, a 100,000-example corpus of short, clear rationales, and use it to train our Liter models. Training on the 100k LiteCoT examples, we outperform models trained on 800k long CoT examples and cut both training and inference costs. The advantage is consistent across standard math benchmarks, showing that concise, human-aligned data delivers equal or better accuracy with much less compute. For example, on the challenging AIME24 exam, our approach reaches 74.2% Pass@1 using only about 5K inference tokens, surpassing other methods that consume many more tokens.

Generative engines (GEs) are reshaping information access by replacing ranked links with citation-grounded answers, yet current Generative Engine Optimization (GEO) methods optimize each instance in isolation and cannot accumulate or transfer effective strategies across tasks and engines. We reframe GEO as a strategy learning problem and propose MAGEO, a multi-agent framework in which coordinated planning, editing, and fidelity-aware evaluation serve as the execution layer, while validated editing patterns are progressively distilled into reusable, engine-specific optimization skills. To enable controlled assessment, we introduce a Twin Branch Evaluation Protocol for causal attribution of content edits and DSV-CF, a dual-axis metric that unifies semantic visibility with attribution accuracy. We further release MSME-GEO-Bench, a multi-scenario, multi-engine benchmark grounded in real-world queries. Experiments on three mainstream engines show that MAGEO substantially outperforms heuristic baselines in both visibility and citation fidelity. Ablations confirm that engine-specific preference modeling and strategy reuse are central to these gains, suggesting a scalable learning-driven paradigm for trustworthy GEO. Code is available at https://github.com/Wu-beining/MAGEO.