Wei Zhao


2026

Updating bilingual dictionary entries is a tedious, time-consuming, and highly subjective task, especially when a new sense in the source language requires identifying an appropriate translation equivalent. To date, there have been no attempts to automate the discovery of new bilingual sense entries. Related tasks such as Word-level Bilingual Dictionary Induction and cross-lingual embedding alignment do not account for polysemy and are not applied to lexicographic data. Compared with their monolingual counterparts, bilingual dictionaries fall short in completeness, in the detail of their examples and glosses, and in diachronic information. We introduce a novel NLP task, Sense-Level Bilingual Dictionary Induction (SenseBDI), at the intersection of lexical semantics, cross-lingual NLP, and diachronic NLP. We construct a dataset of time-stamped sense-level bilingual dictionary entries by aligning two bilingual dictionaries, two monolingual dictionaries, and the multilingual resource BabelNet, thereby enriching bilingual entries with monolingual source-language information. We propose a baseline based on nearest-neighbor search over cross-lingual embeddings of glosses and usages. We find that usages contribute more strongly than glosses, with substantial variation across language pairs, and we discuss task-specific challenges with regard to target-language polysemy as well as future directions such as transfer to real-world scenarios.
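The nearest-neighbor baseline mentioned in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the three-dimensional vectors are made-up toy values standing in for cross-lingual embeddings, the function names (`cosine`, `combine`, `nearest_targets`) and the `usage_weight` parameter are hypothetical, and the weighted average is just one plausible way of combining gloss and usage embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def combine(gloss_vec, usage_vec, usage_weight=0.5):
    """Weighted average of a gloss embedding and a usage embedding.

    usage_weight is a hypothetical knob: 0.0 uses the gloss only,
    1.0 uses the usage example only.
    """
    return [
        (1.0 - usage_weight) * g + usage_weight * u
        for g, u in zip(gloss_vec, usage_vec)
    ]


def nearest_targets(query_vec, candidates, k=1):
    """Return the labels of the k candidates most similar to query_vec."""
    scored = [(cosine(query_vec, vec), label) for label, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [label for _, label in scored[:k]]


# Toy data (made up): a source-language sense of "mouse" as a computer device.
source_gloss_vec = [0.1, 0.9, 0.0]
source_usage_vec = [0.0, 1.0, 0.1]

# Toy target-language sense candidates. The target word is polysemous,
# so the search has to pick a sense, not just a word.
candidates = {
    "Maus (animal)": [0.9, 0.1, 0.0],
    "Maus (device)": [0.1, 0.95, 0.05],
}

query = combine(source_gloss_vec, source_usage_vec, usage_weight=0.5)
best = nearest_targets(query, candidates, k=1)
print(best)
```

Varying `usage_weight` between 0.0 and 1.0 gives a simple way to compare gloss-only, usage-only, and mixed queries in a setup like this one.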
Large language models (LLMs) are increasingly used for creative tasks such as literary translation. Yet translational creativity remains underexplored and is rarely evaluated at scale, while source-text comprehension is typically studied in isolation, despite the fact that, in professional translation, comprehension and creativity are tightly intertwined. We address these gaps with a paired-task framework applied to literary excerpts from 11 books. Task 1 assesses source-text comprehension, and Task 2 evaluates translational creativity through Units of Creative Potential (UCPs), such as metaphors and wordplay. Using a scalable evaluation setup that combines expert human annotations with UCP-based automatic scoring, we benchmark 23 models and four creativity-oriented prompts. Our findings show that strong comprehension does not translate into human-level creativity: models often produce literal or contextually inappropriate renderings, with particularly large gaps for the more distant English–Chinese language pair. Creativity-oriented prompts yield only modest gains, and only one model, Mistral-Large, comes close to human-level creativity (0.167 vs. 0.246). Across all model–prompt combinations, only three exceed a creativity score of 0.1, while the rest remain at or near zero.