Aishan Liu
2025
ELBA-Bench: An Efficient Learning Backdoor Attacks Benchmark for Large Language Models
Xuxu Liu
|
Siyuan Liang
|
Mengya Han
|
Yong Luo
|
Aishan Liu
|
Xiantao Cai
|
Zheng He
|
Dacheng Tao
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Generative large language models are crucial in natural language processing, but they are vulnerable to backdoor attacks, where subtle triggers compromise their behavior. Although backdoor attacks against LLMs are constantly emerging, existing benchmarks remain limited in terms of sufficient coverage of attack, metric system integrity, backdoor attack alignment. And existing pre-trained backdoor attacks are idealized in practice due to resource access constraints. Therefore we establish ELBA-Bench, a comprehensive and unified framework that allows attackers to inject backdoor through parameter efficient fine-tuning (e.g., LoRA) or without fine-tuning techniques (e.g., In-context-learning). ELBA-Bench provides over 1300 experiments encompassing the implementations of 12 attack methods, 18 datasets, and 12 LLMs. Extensive experiments provide new invaluable findings into the strengths and limitations of various attack strategies. For instance, PEFT attack consistently outperform without fine-tuning approaches in classification tasks while showing strong cross-dataset generalization with optimized triggers boosting robustness; Task-relevant backdoor optimization techniques or attack prompts along with clean and adversarial demonstrations can enhance backdoor attack success while preserving model performance on clean samples. Additionally, we introduce a universal toolbox designed for standardized backdoor attack research at https://github.com/NWPUliuxx/ELBA_Bench, with the goal of propelling further progress in this vital area.
2020
Dialogue Policies for Learning Board Games through Multimodal Communication
Maryam Zare
|
Ali Ayub
|
Aishan Liu
|
Sweekar Sudhakara
|
Alan Wagner
|
Rebecca Passonneau
Proceedings of the 21th Annual Meeting of the Special Interest Group on Discourse and Dialogue
This paper presents MDP policy learning for agents to learn strategic behavior–how to play board games–during multimodal dialogues. Policies are trained offline in simulation, with dialogues carried out in a formal language. The agent has a temporary belief state for the dialogue, and a persistent knowledge store represented as an extensive-form game tree. How well the agent learns a new game from a dialogue with a simulated partner is evaluated by how well it plays the game, given its dialogue-final knowledge state. During policy training, we control for the simulated dialogue partner’s level of informativeness in responding to questions. The agent learns best when its trained policy matches the current dialogue partner’s informativeness. We also present a novel data collection for training natural language modules. Human subjects who engaged in dialogues with a baseline system rated the system’s language skills as above average. Further, results confirm that human dialogue partners also vary in their informativeness.
Search
Fix author
Co-authors
- Ali Ayub 1
- Xiantao Cai 1
- Mengya Han 1
- Zheng He (何铮) 1
- Siyuan Liang 1
- show all...