Jianwei Yan


2025

Modeling the Law of Abbreviation in Classical, Modern, and ChatGPT-Generated Chinese: A Power-Law Analysis of Structural Economy
Jianwei Yan | Heng Chen
Proceedings of the Third Workshop on Quantitative Syntax (QUASY, SyntaxFest 2025)

This study investigates the Law of Abbreviation, the inverse relationship between word length and frequency, across Classical, Modern, and ChatGPT-generated Chinese. Using a tri-partite parallel corpus and a power-law model y = a*x^(-b), we analyze the relationship between word length and the average usage frequency of words within each word-length category to assess structural economy. Results confirm a consistent Zipfian distribution across all text types, with high R² values indicating a strong model fit. However, the parameter b varies significantly: Classical Chinese shows the steepest decline, suggesting strong pressure for brevity; Modern Chinese exhibits a moderated pattern; and ChatGPT-generated texts display the weakest pressure, prioritizing fluency over compression. These differences reflect evolving communicative priorities and reveal that, while AI models can mimic statistical distributions, they underrepresent the deeper structural pressures found in natural language evolution. This study offers new insights into lexical optimization, and the parameter b provides a useful metric for comparing structural efficiency across modalities. Implications are discussed in relation to language modeling, cognitive economy, and the evolution of linguistic structure.
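The fitting procedure described in the abstract can be illustrated with a short script. The following is a minimal sketch, not the authors' actual code: it assumes hypothetical (word length, mean frequency) pairs in place of the paper's corpus data and uses scipy.optimize.curve_fit to estimate a and b, plus an R² value on the original scale.

# Minimal sketch: fitting the power-law model y = a * x**(-b), where x is word
# length and y is the mean usage frequency of words of that length.
# The data points below are illustrative placeholders, not values from the paper.
import numpy as np
from scipy.optimize import curve_fit

def power_law(x, a, b):
    # Power-law model y = a * x**(-b).
    return a * np.power(x, -b)

# Hypothetical (word length, mean frequency) observations.
lengths = np.array([1, 2, 3, 4, 5, 6], dtype=float)
mean_freqs = np.array([1200.0, 310.0, 95.0, 42.0, 21.0, 12.0])

# Estimate a and b; a larger b means a steeper decline in frequency with length,
# i.e. stronger pressure for brevity.
(a_hat, b_hat), _ = curve_fit(power_law, lengths, mean_freqs, p0=(1000.0, 1.0))

# Goodness of fit (R^2) on the original scale.
predicted = power_law(lengths, a_hat, b_hat)
ss_res = np.sum((mean_freqs - predicted) ** 2)
ss_tot = np.sum((mean_freqs - np.mean(mean_freqs)) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"a = {a_hat:.2f}, b = {b_hat:.2f}, R^2 = {r_squared:.3f}")

Under this setup, comparing the fitted b across Classical, Modern, and ChatGPT-generated corpora is what supports the paper's claim that b serves as a metric of structural efficiency.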

2022

How syntactic analysis influences the calculation of mean dependency distance: Evidence from the enhanced dependency representation
Tsy Yih | Jianwei Yan | Haitao Liu
Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation

2019

Which annotation scheme is more expedient to measure syntactic difficulty and cognitive demand?
Jianwei Yan | Haitao Liu
Proceedings of the First Workshop on Quantitative Syntax (Quasy, SyntaxFest 2019)