Yuxuan Hu
Papers on this page may belong to the following people: Yuxuan Hu, Yuxuan Hu
2026
MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning
Weikang Shi | Aldrich Yu | Rongyao Fang | Houxing Ren | Ke Wang | Aojun Zhou | Changyao Tian | Xinyu Fu | Yuxuan Hu | Zimu Lu | Linjiang Huang | Si Liu | Rui Liu | Hongsheng Li
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
While Large Language Models (LLMs) have excelled in textual reasoning, they struggle with mathematical domains like geometry that intrinsically rely on visual aids. Existing approaches to Visual Chain-of-Thought (VCoT) are often limited by rigid external tools or fail to generate the high-fidelity, strategically timed diagrams necessary for complex problem-solving. To bridge this gap, we introduce MathCanvas, a comprehensive framework designed to endow unified Large Multimodal Models (LMMs) with intrinsic VCoT capabilities for mathematics. Our approach consists of two phases. First, a Visual Manipulation stage pre-trains the model on a novel 15.2M-pair corpus, comprising 10M caption-to-diagram pairs (MathCanvas-Imagen) and 5.2M step-by-step editing trajectories (MathCanvas-Edit), to master diagram generation and editing. Second, a Strategic Visual-Aided Reasoning stage fine-tunes the model on MathCanvas-Instruct, a new 219K-example dataset of interleaved visual-textual reasoning paths, teaching it when and how to leverage visual aids. To facilitate rigorous evaluation, we introduce MathCanvas-Bench, a challenging benchmark with 3K problems that require models to produce interleaved visual-textual solutions. Our model, BAGEL-Canvas, trained under this framework, achieves an 86% relative improvement over strong LMM baselines on MathCanvas-Bench and demonstrates excellent generalization to other public math benchmarks. Our work provides a complete toolkit (framework, datasets, and benchmark) to unlock complex, human-like visual reasoning in LMMs.
2025
LM-Searcher: Cross-domain Neural Architecture Search with LLMs via Unified Numerical Encoding
Yuxuan Hu | Jihao Liu | Ke Wang | Jinliang Zheng | Weikang Shi | Manyuan Zhang | Qi Dou | Rui Liu | Aojun Zhou | Hongsheng Li
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Recent progress in Large Language Models (LLMs) has opened new avenues for solving complex optimization problems, including Neural Architecture Search (NAS). However, existing LLM-driven NAS approaches rely heavily on prompt engineering and domain-specific tuning, limiting their practicality and scalability across diverse tasks. In this work, we propose LM-Searcher, a novel framework that leverages LLMs for cross-domain neural architecture optimization without the need for extensive domain-specific adaptation. Central to our approach is NCode, a universal numerical string representation for neural architectures, which enables cross-domain architecture encoding and search. We also reformulate the NAS problem as a ranking task, training LLMs to select high-performing architectures from candidate pools using instruction-tuning samples derived from a novel pruning-based subspace sampling strategy. Our curated dataset, encompassing a wide range of architecture-performance pairs, encourages robust and transferable learning. Comprehensive experiments demonstrate that LM-Searcher achieves competitive performance in both in-domain (e.g., CNNs for image classification) and out-of-domain (e.g., LoRA configurations for segmentation and generation) tasks, establishing a new paradigm for flexible and generalizable LLM-based architecture search.
2024
SP3: Enhancing Structured Pruning via PCA Projection
Yuxuan Hu | Jing Zhang | Zhe Zhao | Chen Zhao | Xiaodong Chen | Cuiping Li | Hong Chen
Findings of the Association for Computational Linguistics: ACL 2024
2023
A Generation-based Deductive Method for Math Word Problems
Yuxuan Hu | Jing Zhang | Haoyang Li | Cuiping Li | Hong Chen
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Math word problems (MWPs) involving advanced operators, such as a linear equation solver, cannot be easily tackled by earlier MWP methods, because existing generation methods suffer from repeated sub-expression generation and deductive methods are restricted to binary operations. This paper proposes a new multivariate directed acyclic graph (mDAG) as an alternative to the generation methods’ binary expression tree or the deductive methods’ binary directed acyclic graph. Then, to produce the topological ordering of the mDAG, we propose a generation-based deductive (GeDe) model, which equips a generation model with a re-encoder to keep the deductive property while avoiding the expensive enumeration of the deductive methods. GeDe performs well on math problems with many operators on widely used benchmarks, as well as on problems with multivariate operators on our own CMWPA benchmark. Our code is available at https://github.com/hyx1999/GeDe
2008
A Cascaded Syntactic and Semantic Dependency Parsing System
Wanxiang Che | Zhenghua Li | Yuxuan Hu | Yongqiang Li | Bing Qin | Ting Liu | Sheng Li
CoNLL 2008: Proceedings of the Twelfth Conference on Computational Natural Language Learning
2007
HIT-IR-WSD: A WSD System for English Lexical Sample Task
Yuhang Guo | Wanxiang Che | Yuxuan Hu | Wei Zhang | Ting Liu
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)
2005
Co-authors
- Wanxiang Che (车万翔) 3
- Ting Liu 3
- Hong Chen 2
- Cuiping Li 2
- Hongsheng Li 2
- Sheng Li 2
- Weikang Shi 2
- Ke Wang 2
- Jing Zhang 2
- Aojun Zhou 2
- Xiaodong Chen 1
- Qi Dou 1
- Rongyao Fang 1
- Xinyu Fu 1
- Yuhang Guo (郭宇航) 1
- Linjiang Huang 1
- Haoyang Li 1
- Yongqiang Li 1
- Zhenghua Li (李正华) 1
- Huaijun Liu 1
- Jihao Liu 1
- Rui Liu 1
- Rui Liu 1
- Si Liu 1
- Zimu Lu 1
- Bing Qin (秦兵) 1
- Houxing Ren 1
- Changyao Tian 1
- Aldrich Yu 1
- Manyuan Zhang 1
- Wei Zhang 1
- Chen Zhao 1
- Zhe Zhao 1
- Jinliang Zheng 1