Chenliang Zhou


2025

pdf bib
Analyzing and Modeling LLM Response Lengths with Extreme Value Theory: Anchoring Effects and Hybrid Distributions
Liuxuan Jiao | Chen Gao | Yiqian Yang | Chenliang Zhou | YiXian Huang | Xinlei Chen | Yong Li
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

We present a statistical framework for modeling and controlling large language model (LLM) response lengths using extreme value theory. Analyzing 14,301 GPT-4o responses across temperature and prompting conditions, with cross-validation on Qwen and DeepSeek architectures, we demonstrate that verbosity follows Weibull-type generalized extreme value (GEV) distributions with heavier tails under stochastic generation. Our key contributions include: (1) development of a novel GEV-generalized Pareto (GPD) hybrid model that improves tail fit (R2CDF=0.9993 vs standalone GEV’s 0.998) while maintaining architectural generalizability; (2) quantitative characterization of prompt anchoring effects across models, showing reduced dispersion but increased outliers under randomization; and (3) identification of temperature-dependent response patterns that persist across architectures, with higher temperatures amplifying length variability while preserving extreme-value mechanisms. The hybrid model’s threshold selection method enables precise verbosity control in production systems regardless of model choice. While validated on multiple architectures, generalizability to emerging model families requires further study.

2023

pdf bib
Information Compression via Eliding Verb Phrase: A Dependency-Based Study
Zheyuan Dai | Chenliang Zhou | Haitao Liu
Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation

2022

pdf bib
Discourse Markers as the Classificatory Factors of Speech Acts
Da Qi | Chenliang Zhou | Haitao Liu
Proceedings of the 21st Chinese National Conference on Computational Linguistics

“Since the debut of the speech act theory, the classification standards of speech acts have been in dispute. Traditional abstract taxonomies seem insufficient to meet the needs of artificial intelligence for identifying and even understanding speech acts. To facilitate the automatic identification of the communicative intentions in human dialogs, scholars have tried some data-driven methods based on speech-act annotated corpora. However, few studies have objectively evaluated those classification schemes. In this regard, the current study applied the frequencies of the eleven discourse markers (oh, well, and, but, or, so, because, now, then, I mean, and you know) proposed by Schiffrin (1987) to investigate whether they can be effective indicators of speech act variations. The results showed that the five speech acts of Agreement can be well classified in terms of their functions by the frequencies of discourse markers. Moreover, it was found that the discourse markers well and oh are rather efficacious in differentiating distinct speech acts. This paper indicates that quantitative indexes can reflect the characteristics of human speech acts, and more objective and data-based classification schemes might be achieved based on these metrics.”