Felix Mao


2026

This paper investigates the distinctive linguistic characteristics of regional English variants through a quantitative analysis of global media coverage. The study applies classification techniques that combine GPT-based embeddings with Support Vector Machines to a novel corpus, the Olympic Journalism English Variants Corpus. Comprising news articles on the Olympic Games published by prominent news outlets in the United States, China, Spain, and Mexico between 2020 and 2023, the corpus enables a fine-grained analysis of 164 linguistic features across lexical, syntactic, readability, and sentiment dimensions. The findings reveal strong and interpretable distinctions in features such as verb ratio, nominality, and readability. The study demonstrates the enhanced classification capability of the model (optimized F1 score = 97.2) and yields a deeper, data-driven stylistic analysis of each English variant. It also provides a potential template that can be extended to other World Englishes research.
Generation Z’s mental health discourse has been uniquely shaped by digital saturation and the COVID-19 pandemic. This study introduces a large-scale corpus of Gen Z mental health discourse on Reddit, comprising over 3 million posts across 11 subreddits (2017–2025), identified through behavioral cross-posting between mental health and Gen Z-identified communities. Using a hybrid methodology that integrates statistical corpus linguistics with NLP techniques, we conduct diachronic keyness analysis, sentiment tracking, and topic modeling to examine lexical, syntactic, and semantic patterns across pre-, during-, and post-COVID periods. Our analysis reveals: (1) ritualized support exchanges, more pronounced among Gen Z users, in which highly negative self-disclosure functions as an authenticity signal; (2) a pandemic-induced reframing of existing mental health topics, particularly a rise in physical symptoms, followed by a sustained post-pandemic sentiment decline; and (3) a generational divergence in which Gen Z favors abstract, existential concerns, unlike the pragmatic focus of non-Gen Z users. This study contributes a replicable approach for analyzing youth discourse and underscores the importance of culturally and linguistically informed digital mental health interventions that support Gen Z’s modes of expressing distress rather than pathologizing them.