Wenyao Cui

2026

ConMA: Confidence-Guided Kernel Sampling with Multi-Stage Aggregation for LLM Reasoning
Yinuo Wang | Qingjie Li | Wenyao Cui | Qiuchi Li | Zhang Huaping
Findings of the Association for Computational Linguistics: ACL 2026

Test-time scaling (TTS) enhances LLM reasoning capabilities by sampling and aggregating diverse solution trajectories. However, existing approaches often rely on external verifiers and one-shot independent sampling, which results in inefficient budget allocation and underutilizes interim high-quality trajectories. We propose ConMA, a training-free, verifier-free TTS framework that reallocates a fixed inference budget into iterative sample–filter–diversify–select cycles: it filters answer groups based on intrinsic token-probability confidence, enriches candidates through diversity-aware expansion, and employs repeated single-choice selection for multi-stage refinement. Across multiple benchmarks, ConMA consistently improves accuracy under fixed budgets. With a maximum budget of N=64, ConMA boosts Qwen3-4B to 80% accuracy on AIME25, significantly outperforming strong baselines while converging early with only 18 samples on average, substantially reducing inference cost.

2023

PsyAttention: Psychological Attention Model for Personality Detection
Baohua Zhang | Yongyi Huang | Wenyao Cui | Zhang Huaping | Jianyun Shang
Findings of the Association for Computational Linguistics: EMNLP 2023

Work on personality detection has tended to incorporate psychological features from different personality models, such as BigFive and MBTI. There are more than 900 psychological features, each of which is helpful for personality detection. However, when used in combination, the application of different calculation standards among these features may result in interference between features calculated using distinct systems, thereby introducing noise and reducing performance. This paper adapts different psychological models in the proposed PsyAttention for personality detection, which can effectively encode psychological features, reducing their number by 85%. In experiments on the BigFive and MBTI models, PsyAttention achieved average accuracy of 65.66% and 86.30%, respectively, outperforming state-of-the-art methods, indicating that it is effective at encoding psychological features.

Co-authors

Yinuo Wang 1

Baohua Zhang 1

Venues

Findings 2
