Yulong Wang

2026

TARE: Lightweight Token-Aware Representation Editing for Fine-tuning Transformer-like Models
Yulong Wang | Siyu Zhao
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Parameter-efficient fine-tuning (PEFT) must balance effectiveness and efficiency: low-rank methods can be costly, while global representation edits often underfit token-level contexts. We propose **Token-Aware Representation Editing (TARE)**, a PEFT method that performs fine-grained, token-specific edits with a small additional inference overhead and minimal tuning. After each FFN block in a transformer-like model, we adopt a lightweight selector that scores a small pool of hidden representation editors for each token, activates only the top-k editors, and mixes their element-wise scaling/bias updates. This design achieves superior performance while maintaining computational efficiency, yielding a more favorable Pareto frontier compared to state-of-the-art (SOTA) methods. Across LLaMA-3-8B (eight knowledge reasoning and seven mathematical reasoning tasks) and RoBERTa-base/large (GLUE), TARE outperforms SOTAs (LoRA, DoRA, MiLoRA, LoReFT, and RED), achieving 86.7% (knowledge reasoning), 76.7% (mathematical reasoning), and 88.3% (GLUE) while tuning only 0.0392% of parameters using about 20 GiB peak GPU memory during training. An implementation is available at: <https://github.com/PatriciaPulec/tare>.
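
The selector-and-editor step described in the abstract can be illustrated with a minimal pure-Python sketch for a single token. This is not the authors' implementation: the function name `tare_edit`, the linear selector, and the softmax mixing over the selected editors are all assumptions made for illustration, since the abstract only states that editors are scored per token, the top-k are activated, and their element-wise scaling/bias updates are mixed.

```python
import math

def tare_edit(h, selector_w, editors, k):
    """Edit one token's hidden vector h (hypothetical sketch).

    h          : list of d floats, the hidden state after an FFN block
    selector_w : n rows of d floats; a linear selector (assumption)
    editors    : n pairs (scale, bias), each a list of d floats
    k          : number of editors to activate
    """
    # Score every editor for this token.
    scores = [sum(w * x for w, x in zip(row, h)) for row in selector_w]
    # Keep only the k highest-scoring editors.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the selected scores (assumed mixing rule).
    m = max(scores[i] for i in top)
    exps = [math.exp(scores[i] - m) for i in top]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Mix the element-wise scale and bias of the active editors.
    d = len(h)
    scale = [sum(w * editors[i][0][j] for w, i in zip(weights, top)) for j in range(d)]
    bias = [sum(w * editors[i][1][j] for w, i in zip(weights, top)) for j in range(d)]
    return [s * x + b for s, x, b in zip(scale, h, bias)], top

# Toy example: 3 editors, hidden size 2, only the best editor active.
h = [1.0, 2.0]
selector_w = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
editors = [
    ([1.0, 1.0], [0.0, 0.0]),
    ([2.0, 2.0], [0.5, 0.5]),
    ([0.0, 0.0], [9.0, 9.0]),
]
edited, chosen = tare_edit(h, selector_w, editors, k=1)
print(edited, chosen)  # [2.5, 4.5] [1]
```

Because each editor holds only a scale and a bias vector, the number of tuned parameters in this sketch grows with the hidden size and the pool size rather than with the size of any weight matrix, which is consistent with the small parameter budget the abstract reports.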

Reference Attack: A New Cross-Modal Jailbreaking Attack against Multimodal Large Language Models
Yulong Wang | Yifei Fu | Jiayi Gao
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Red team testing, an effective proactive method for evaluating the security of multimodal large language models (MLLMs), requires an expanding toolkit alongside the development of MLLM safeguards. We propose the Reference Attack, a powerful tool for red team testing against MLLMs. The Reference Attack is a reference-guided cross-modal jailbreak method that enhances existing prompt-to-image injection attacks by exploiting MLLMs’ semantic reconstruction capabilities. Our method embeds malicious prompts in non-text modalities (e.g., images, spreadsheets) and constructs recursive symbolic references in text, enabling MLLMs to gradually recover and generate harmful content through layered reference resolution. The attack introduces a new vector that circumvents conventional content moderation by exploiting MLLMs’ lack of security checks during cross-modal reference resolution. We evaluate the Reference Attack on leading MLLMs, including ChatGPT, Gemini, Claude, and the widely used open-source LLaMA model, and achieve an attack success rate of over 93% across all tested models. Compared to state-of-the-art attacks, the Reference Attack achieves higher success rates than all baselines under identical evaluation, with a maximum gain of 70.8%. Our study reveals a critical gap in MLLM security and highlights the need for strict security auditing of cross-modal interactions in future content moderation.

2025

DisLoRA: Task-specific Low-Rank Adaptation via Orthogonal Basis from Singular Value Decomposition
She Yifei | Xinhao Wei | Yulong Wang
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing

Parameter-efficient fine-tuning (PEFT) of large language models (LLMs) is critical for adapting to diverse downstream tasks with minimal computational cost. We propose **Di**rectional-**S**VD **Lo**w-**R**ank **A**daptation (DisLoRA), a novel PEFT framework that leverages singular value decomposition (SVD) to decompose pretrained weight matrices into orthogonal backbone and task-specific subspaces, enabling precise capture of task-specific directions (TSDs). By dynamically identifying TSDs and employing adaptive soft orthogonal regularization with a mean-normalization mechanism, DisLoRA balances task-specific and orthogonal losses without manual tuning, ensuring robust training stability. Extensive experiments on GLUE and Commonsense Reasoning benchmarks demonstrate that DisLoRA surpasses established PEFT methods, including LoRA, PiSSA, DoRA, LoRA-Dash, and SORSA. DisLoRA achieves superior performance on multiple individual GLUE datasets, surpassing baselines by up to 10.28% on SST-2 and 3.28% on CoLA, and consistently attains higher average accuracy than baselines across Commonsense Reasoning tasks, with a maximum gain of 3.1%. These results demonstrate DisLoRA’s efficient and high-performing LLM adaptation for domain-specific tasks while preserving generalization.
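
As a rough illustration of the SVD-based split mentioned in the abstract, the following NumPy sketch divides a weight matrix into two mutually orthogonal parts. It is only a sketch under stated assumptions: the helper name `split_by_svd`, the choice of the top-`r` singular directions as the "backbone", and the treatment of the remainder as the candidate task-specific subspace are not taken from the paper, and the regularization and TSD identification steps are not shown.

```python
import numpy as np

def split_by_svd(W, r):
    """Split W into a rank-r part and its orthogonal remainder (hypothetical helper)."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    # Assumption: the top-r singular directions form the "backbone".
    backbone = (U[:, :r] * S[:r]) @ Vt[:r]
    # Assumption: the remaining directions span the task-specific subspace.
    residual = (U[:, r:] * S[r:]) @ Vt[r:]
    return backbone, residual

rng = np.random.default_rng(0)
W = rng.standard_normal((6, 4))  # stand-in for a pretrained weight matrix
backbone, residual = split_by_svd(W, r=2)

# The two parts sum back to W, and their column spaces are orthogonal.
print(np.allclose(backbone + residual, W))
print(np.allclose(backbone.T @ residual, 0.0, atol=1e-10))
```

The orthogonality in this sketch follows directly from the orthonormal singular vectors, so no extra penalty is needed at initialization; a soft orthogonality regularizer of the kind the abstract mentions would matter once the task-specific part is updated during training.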
