Fully Automated Identification of Lexical Alignment and Preference-Stage Shifts in Large Language Models

Thomas Stephan Juzek, Xiaoyang Ming, Jose A. Hernandez


Abstract
The language used by digital chat assistants such as ChatGPT can diverge from human expectations (misalignment). Research, mostly on Scientific English, has described both WHAT divergences occur and, to some extent, WHY, linking them to the training stage of human preference learning. Yet, existing approaches rely on manual curation. This paper introduces two curation-free, assumption-light evaluation metrics: the Lexical Alignment Score, which identifies lexical overuse, and the Triangulated Preference Shift, which quantifies how much of such shifts can be attributed to human preference learning. Using PubMed abstracts, continuations were generated and measured using windowed document prevalence across six model families (Falcon, Gemma, Llama, Mistral, OLMo, Yi). The procedure identifies, without manual intervention, overused items such as ‘suggest’, ‘additionally’, and ‘strategy’, and estimates their link to preference learning. Our findings replicate prior work and remain stable across parameter settings, random seeds, and evaluation on further data. The approach scales readily and enables systematic study of lexical (mis)alignment beyond Scientific English and across languages, and as such, the metrics have the potential to contribute to improved alignment for future models and understanding of its origins.
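To make the idea of comparing document prevalence between human text and model continuations concrete, the following is a minimal sketch under stated assumptions. It is not the paper's Lexical Alignment Score or Triangulated Preference Shift, whose definitions are not given in this abstract. The function names, the whitespace tokenizer, the reading of "windowed" as "first `window` tokens of each document", the smoothing constant `eps`, and the toy sentences are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def document_prevalence(docs, window=None):
    """Share of documents in which each word occurs at least once.

    Assumption: "windowed" is read here as keeping only the first
    `window` whitespace tokens of each document.
    """
    if not docs:
        return {}
    counts = Counter()
    for doc in docs:
        tokens = doc.lower().split()
        if window is not None:
            tokens = tokens[:window]
        # Count each word once per document, however often it repeats.
        counts.update(set(tokens))
    return {word: c / len(docs) for word, c in counts.items()}


def log_prevalence_ratio(human_docs, model_docs, window=None, eps=1e-6):
    """Log ratio of model prevalence to human prevalence per word.

    Positive values mean the word appears in a larger share of model
    documents than human documents. `eps` is an assumed smoothing term
    that avoids division by zero for unseen words.
    """
    human = document_prevalence(human_docs, window)
    model = document_prevalence(model_docs, window)
    vocab = set(human) | set(model)
    return {
        w: math.log((model.get(w, 0.0) + eps) / (human.get(w, 0.0) + eps))
        for w in vocab
    }


# Toy inputs, invented purely for illustration.
human_docs = [
    "results suggest a link",
    "we report a link",
    "the data show a trend",
]
model_docs = [
    "results suggest a link",
    "findings suggest a strategy",
    "additionally we suggest a trend",
]

scores = log_prevalence_ratio(human_docs, model_docs)
overused = sorted((w for w in scores if scores[w] > 0), key=scores.get, reverse=True)
print(overused)
```

In this toy setting, words with a positive score occur in a larger share of the model documents than of the human documents. A real analysis would need proper tokenization, a much larger sample, and the metric definitions from the paper itself.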
Anthology ID:
2026.lrec-main.484
Volume:
Proceedings of the Fifteenth Language Resources and Evaluation Conference
Month:
May
Year:
2026
Address:
Palma de Mallorca, Spain
Editors:
Stelios Piperidis, Núria Bel, Henk van den Heuvel, Nancy Ide, Simon Krek, Antonio Toral
Venue:
LREC
Publisher:
ELRA Language Resources Association
Pages:
6116–6131
URL:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.484/
Cite (ACL):
Thomas Stephan Juzek, Xiaoyang Ming, and Jose A. Hernandez. 2026. Fully Automated Identification of Lexical Alignment and Preference-Stage Shifts in Large Language Models. International Conference on Language Resources and Evaluation, main:6116–6131.
Cite (Informal):
Fully Automated Identification of Lexical Alignment and Preference-Stage Shifts in Large Language Models (Juzek et al., LREC 2026)
PDF:
https://preview.aclanthology.org/ingest-lrec/2026.lrec-main.484.pdf