Yiqian Yang



2025

Analyzing and Modeling LLM Response Lengths with Extreme Value Theory: Anchoring Effects and Hybrid Distributions
Liuxuan Jiao | Chen Gao | Yiqian Yang | Chenliang Zhou | YiXian Huang | Xinlei Chen | Yong Li
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

We present a statistical framework for modeling and controlling large language model (LLM) response lengths using extreme value theory. Analyzing 14,301 GPT-4o responses across temperature and prompting conditions, with cross-validation on Qwen and DeepSeek architectures, we demonstrate that verbosity follows Weibull-type generalized extreme value (GEV) distributions with heavier tails under stochastic generation. Our key contributions include: (1) development of a novel GEV-generalized Pareto (GPD) hybrid model that improves tail fit ($R^2_{\mathrm{CDF}} = 0.9993$ vs. standalone GEV’s 0.998) while maintaining architectural generalizability; (2) quantitative characterization of prompt anchoring effects across models, showing reduced dispersion but increased outliers under randomization; and (3) identification of temperature-dependent response patterns that persist across architectures, with higher temperatures amplifying length variability while preserving extreme-value mechanisms. The hybrid model’s threshold selection method enables precise verbosity control in production systems regardless of model choice. While validated on multiple architectures, generalizability to emerging model families requires further study.
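
To make the idea of a GEV-GPD hybrid concrete, the sketch below splices a GEV body to a GPD tail at a threshold u. The abstract does not give the paper's construction, so this splice is an assumption: it is a common textbook form, not necessarily the authors' model. The function names, parameter values, and threshold are hypothetical and chosen only for illustration.

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """CDF of the generalized extreme value distribution.
    xi < 0 is the Weibull type (bounded upper tail)."""
    if xi == 0.0:
        return math.exp(-math.exp(-(x - mu) / sigma))
    t = 1.0 + xi * (x - mu) / sigma
    if t <= 0.0:
        # Outside the support: above the upper endpoint when xi < 0,
        # below the lower endpoint when xi > 0.
        return 1.0 if xi < 0 else 0.0
    return math.exp(-t ** (-1.0 / xi))

def gpd_cdf(y, beta, xi):
    """CDF of the generalized Pareto distribution for an exceedance y >= 0."""
    if y <= 0.0:
        return 0.0
    if xi == 0.0:
        return 1.0 - math.exp(-y / beta)
    t = 1.0 + xi * y / beta
    if t <= 0.0:
        return 1.0  # beyond the upper endpoint (only possible when xi < 0)
    return 1.0 - t ** (-1.0 / xi)

def hybrid_cdf(x, u, gev_params, gpd_params):
    """Assumed splice: GEV at or below threshold u, rescaled GPD tail above it."""
    if x <= u:
        return gev_cdf(x, *gev_params)
    p_u = gev_cdf(u, *gev_params)  # probability mass at or below the threshold
    return p_u + (1.0 - p_u) * gpd_cdf(x - u, *gpd_params)

# Hypothetical parameters (response length in tokens), not taken from the paper.
GEV_PARAMS = (200.0, 50.0, -0.1)   # mu, sigma, xi (Weibull type)
GPD_PARAMS = (40.0, 0.1)           # beta, xi
THRESHOLD = 300.0

if __name__ == "__main__":
    for length in (100, 200, 300, 400, 800):
        p = hybrid_cdf(length, THRESHOLD, GEV_PARAMS, GPD_PARAMS)
        print(f"P(length <= {length}) = {p:.4f}")
```

Under this assumed form the CDF is continuous at u by construction, and moving u changes how much of the distribution is governed by the tail model, which is why the choice of threshold matters.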