Yizhe WANG

Also published as: Yizhe Wang

2026

From Behavior to Geometry: A Causal and Geometric Analysis of LoRA-Based Domain Adaptation
Yizhe WANG | Liu He | Zhenhua Ling
Proceedings of the Fifteenth Language Resources and Evaluation Conference

Parameter-efficient fine-tuning with Low-Rank Adaptation (LoRA) often improves a large language model’s in-domain performance at the cost of cross-domain generalization. We investigate the mechanistic basis for this trade-off, asking whether LoRA creates new discriminative directions in representation space (emergence) or merely reshapes pre-existing ones. Using a Word Sense Disambiguation testbed, we couple controlled behavioral evaluation with causal localization and geometric diagnostics. We find LoRA learns new, spatially localized discriminative directions in the middle layers of the network, focused at token positions critical for the task. This "subspace extension" account explains why LoRA-tuned models excel on in-domain data but struggle to transfer. As a proof of concept, we introduce a mechanistically informed LoRA configuration that concentrates capacity in the identified layers, promotes rank diversity, and applies light answer-token calibration. Without increasing training budget, it yields consistent improvements in both in- and cross-domain settings, demonstrating that mechanistic insight can guide more efficient adaptation.

2021

Caractérisation des relations sémantiques entre termes multi-mots fondée sur l’analogie (Semantic relations recognition between multi-word terms by means of analogy)
Yizhe Wang | Béatrice Daille | Nabil Hathout
Actes de la 28e Conférence sur le Traitement Automatique des Langues Naturelles. Volume 1 : conférence principale

The terminology of a domain reflects the structure of that domain through the relations between its terms. In this paper, we focus on characterizing the terminological relations that hold between multi-word terms (MWTs) in distributional vector spaces. We built a dataset of French MWTs from the environment domain, linked by lexical semantic relations. We present an experiment in which these semantic relations between MWTs are characterized by means of analogy. The results suggest that an automatic process could be used to help structure terminologies.
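One common way to test a relation by analogy in a distributional space is the vector-offset formulation: if two term pairs stand in the same relation, the difference vectors of the pairs should point in a similar direction. The sketch below illustrates that idea only. The abstract does not state which analogy formulation the authors use, so the offset method, the helper names `offset` and `cosine`, and the three-dimensional toy vectors are all assumptions made for illustration, not data or code from the paper.

```python
import math

def offset(source, target):
    """Difference vector going from the source term to the target term."""
    return [t - s for s, t in zip(source, target)]

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

# Hypothetical toy embeddings; real experiments would use trained vectors.
vectors = {
    "eau": [1.0, 0.0, 0.0],
    "eau polluée": [1.0, 1.0, 0.0],
    "air": [0.0, 0.0, 1.0],
    "air pollué": [0.0, 1.0, 1.0],
    "air pur": [0.0, -1.0, 1.0],
}

# Reference pair whose relation we want to recognize elsewhere.
reference = offset(vectors["eau"], vectors["eau polluée"])

# A candidate pair with a parallel offset is taken to share the relation.
same_relation = cosine(reference, offset(vectors["air"], vectors["air pollué"]))
other_relation = cosine(reference, offset(vectors["air"], vectors["air pur"]))
```

With these toy vectors the first candidate pair has an offset parallel to the reference pair and the second has the opposite offset, which is the kind of contrast an analogy-based characterization relies on.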

2020

A study of semantic projection from single word terms to multi-word terms in the environment domain
Yizhe Wang | Béatrice Daille | Nabil Hathout
Proceedings of the 6th International Workshop on Computational Terminology

The semantic projection method is often used in terminology structuring to infer semantic relations between terms. Semantic projection relies upon the assumption of semantic compositionality: the relation that links simple term pairs remains valid in pairs of complex terms built from these simple terms. This paper proposes to investigate whether this assumption commonly adopted in natural language processing is actually valid. First, we describe the process of constructing a list of semantically linked multi-word terms (MWTs) related to the environmental field through the extraction of semantic variants. Second, we present our analysis of the results from the semantic projection. We find that contexts play an essential role in defining the relations between MWTs.

2018

Apprentissage déséquilibré pour la détection des signaux de l’implication durable dans les conversations en parfumerie (Automatic detection of positive enduring involvement signals in fragrance products reviews)
Yizhe Wang | Damien Nouvel | Gaël Patin | Marguerite Leenhardt
Actes de la Conférence TALN. Volume 1 - Articles longs, articles courts de TALN

Simply detecting positive or negative opinions no longer satisfies researchers and companies. The business world is looking for "business insight". Many methods can be used to address this problem; however, their performance can degrade when the classes are imbalanced. Our work studies techniques for handling imbalanced data in the fragrance domain. Five methods were compared: SMOTE, ADASYN, Tomek links, SMOTE-TL, and class weight modification. The learning algorithm chosen is the SVM, and evaluation is based on precision, recall, and F-measure scores. According to the experimental results, adjusting the weights on misclassification costs with the SVM yields our best F-measure.
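As an illustration of class weight modification, the sketch below computes inverse-frequency weights, so that errors on the rare class cost more during training. The formula `n_samples / (n_classes * class_count)` is the heuristic behind scikit-learn's `class_weight="balanced"` option. The abstract does not say which weighting scheme the authors used, so this choice, the function name `balanced_class_weights`, and the 90/10 label split are assumptions for illustration only.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical imbalanced label set: 90 majority examples, 10 minority examples.
labels = ["other"] * 90 + ["involvement"] * 10
weights = balanced_class_weights(labels)
```

A dictionary of this shape can be passed as the `class_weight` argument of scikit-learn's `SVC`, which scales the misclassification penalty per class.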
