Fangzhou Xiong
2026
LLM Prompt Duel Optimizer: Efficient Label-Free Prompt Optimization
Yuanchen Wu | Saurabh Verma | Justin Lee | Fangzhou Xiong | Poppy Zhang | Amel Awadelkarim | Xu Chen | Yubai Yuan | Shawndra Hill
Findings of the Association for Computational Linguistics: ACL 2026
Large language models (LLMs) are highly sensitive to prompts, but most automatic prompt optimization (APO) methods assume access to ground-truth references (e.g., labeled validation data) that are costly to obtain. We propose the Prompt Duel Optimizer (PDO), a sample-efficient framework for label-free prompt optimization based on pairwise preference feedback from an LLM judge. PDO casts prompt selection as a dueling-bandit problem and combines (i) Double Thompson Sampling to prioritize informative comparisons under a fixed judge budget, with (ii) top-performer guided mutation to expand the candidate pool while pruning weak prompts. Experiments on BIG-bench Hard (BBH) and MS MARCO show that PDO consistently identifies stronger prompts than label-free baselines, while offering favorable quality–cost trade-offs under constrained comparison budgets.
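To make the dueling-bandit framing concrete, here is a minimal sketch of Thompson-sampling pair selection under a fixed comparison budget. It is not the authors' implementation: it is a simplified variant rather than the full Double Thompson Sampling algorithm, and it leaves out the mutation and pruning step. The names `select_duel` and `run_duels` and the `true_pref` matrix are hypothetical, and a simulated coin flip stands in for the LLM judge.

```python
import random


def select_duel(wins, rng):
    """Pick a pair (first, second) of candidate prompts to compare next.

    wins[i][j] counts how often prompt i beat prompt j in past comparisons.
    Simplified Thompson-sampling selection (assumption: not full D-TS).
    """
    k = len(wins)
    # First draw: sample a plausible win probability for every pair
    # from a Beta(wins + 1, losses + 1) posterior.
    theta = [[0.5] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            p = rng.betavariate(wins[i][j] + 1, wins[j][i] + 1)
            theta[i][j], theta[j][i] = p, 1.0 - p
    # Copeland score: how many opponents a prompt beats under this sample.
    copeland = [
        sum(theta[i][j] > 0.5 for j in range(k) if j != i) for i in range(k)
    ]
    best = max(copeland)
    first = rng.choice([i for i in range(k) if copeland[i] == best])
    # Second draw: resample each opponent's chance of beating `first`
    # and pick the opponent with the highest sampled chance.
    second = max(
        (j for j in range(k) if j != first),
        key=lambda j: rng.betavariate(wins[j][first] + 1, wins[first][j] + 1),
    )
    return first, second


def run_duels(true_pref, budget, seed=0):
    """Spend `budget` simulated judge calls and return the win counts.

    true_pref[i][j] is a hypothetical probability that prompt i beats
    prompt j; in practice an LLM judge would supply each outcome.
    """
    rng = random.Random(seed)
    k = len(true_pref)
    wins = [[0] * k for _ in range(k)]
    for _ in range(budget):
        i, j = select_duel(wins, rng)
        if rng.random() < true_pref[i][j]:  # stand-in for the LLM judge
            wins[i][j] += 1
        else:
            wins[j][i] += 1
    return wins
```

Calling `run_duels(pref, budget)` with a square preference matrix for at least two prompts returns the pairwise win counts after exactly `budget` comparisons. Because pairs are chosen by posterior sampling, comparisons tend to concentrate on prompts that still look competitive instead of being spread evenly over all pairs.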
2025
Rethinking Long Context Generation from the Continual Learning Perspective
Zeyuan Yang | Fangzhou Xiong | Peng Li | Yang Liu
Proceedings of the 31st International Conference on Computational Linguistics
Due to the limited context window, Large Language Models (LLMs) struggle with processing long contexts. Although fine-tuning can extend the context window, it incurs substantial computation costs. In contrast, recent tuning-free approaches reallocate the attention mechanism or incorporate temporary trainable parameters. In this work, by jointly modeling instance-level generation with a limited context window and learning over sequential data, we rethink the long context generation of LLMs from a continual learning perspective. In practice, we inspect existing representative approaches and analyze their synergy with continual learning strategies. Moreover, we integrate these strategies into current approaches to further boost LLMs’ efficiency in processing long contexts. Comprehensive experiments and analysis confirm the feasibility of continual learning insights for improving long-context processing.