Yuxi Li


2026

Generative Search Engines (GSEs) have reshaped information retrieval, and Generative Engine Optimization (GEO) has emerged to improve content visibility in GSEs’ responses. Previous methods mainly rely on empirical strategies or query-dependent preferences of GSEs for content optimization. However, they remain limited in effectiveness because they overlook the latent user search demands in queries that drive the content retrieval and response generation of GSEs. To address this, we propose Mind Reader, a novel GEO method that improves content visibility within the generated responses of GSEs through content optimization guided by the latent demands extracted from user searches. Specifically, we propose a decomposition-recombination query augmentation module, which enriches the query with latent semantic information by decomposing it into diverse perspectives, capturing the underlying semantic information, and recombining these perspectives into query variants to support subsequent optimization. Then, we propose a reasoning coverage content optimization module. By optimizing content to cover the critical reasoning information of GSEs, we align the content with user search demands and thereby improve its visibility. Extensive experiments on the widely used GEO-Bench and our proposed PC-GEO show that our method significantly outperforms baselines and effectively improves content visibility (up to 2.44x on objective metrics and 1.23x on subjective metrics on average).

2024

Aspect, a linguistic category describing how actions and events unfold over time, is traditionally characterized by three semantic properties: stativity, durativity and telicity. In this study, we investigate whether and to what extent these properties are encoded in the verb token embeddings of the contextualized spaces of two English language models: BERT and GPT-2. First, we propose an experiment using semantic projections to examine whether the values of the vector dimensions of verbs annotated for stativity, durativity and telicity reflect human linguistic distinctions. Second, we use distributional similarity to replicate the well-known Imperfective Paradox described by Dowty (1977), and assess whether the embedding models are sensitive enough to capture contextual nuances of verb telicity. Our results show that both models encode the semantic distinctions for the aspect properties of stativity and telicity in most of their layers, while durativity is the most challenging feature. As for the Imperfective Paradox, only the embedding similarities computed with the vectors from the early layers of the BERT model align with the expected pattern.
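
As a rough illustration of the semantic projection idea mentioned above, the sketch below builds an axis from the difference between the mean embeddings of two annotated verb groups and projects new verb vectors onto it. This is only a minimal sketch under stated assumptions: the three-dimensional vectors are made-up toy values rather than BERT or GPT-2 embeddings, and the helper names (`mean_vector`, `semantic_axis`, `project`) are hypothetical and not taken from the paper, whose exact procedure may differ.

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]

def semantic_axis(positive, negative):
    """Axis pointing from the negative-class mean to the positive-class mean."""
    pos_mean = mean_vector(positive)
    neg_mean = mean_vector(negative)
    return [p - q for p, q in zip(pos_mean, neg_mean)]

def project(vector, axis):
    """Scalar projection of a vector onto the (normalized) axis."""
    norm = math.sqrt(sum(a * a for a in axis))
    return sum(v * a for v, a in zip(vector, axis)) / norm

# Toy embeddings (assumed values, for illustration only).
stative_verbs = [[1.0, 0.0, 0.2], [0.8, 0.1, 0.0]]
dynamic_verbs = [[0.0, 1.0, 0.1], [0.1, 0.9, 0.3]]

axis = semantic_axis(stative_verbs, dynamic_verbs)

# Held-out toy vectors: one resembling the stative group, one the dynamic group.
stative_like = project([0.9, 0.2, 0.1], axis)
dynamic_like = project([0.1, 0.8, 0.2], axis)

print(stative_like > dynamic_like)
```

If a property is encoded along such an axis, verbs annotated with the property should receive higher projection values than verbs without it; with real contextualized embeddings, the same comparison could be repeated layer by layer.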

2023