2025
pdf
bib
abs
ELBA-Bench: An Efficient Learning Backdoor Attacks Benchmark for Large Language Models
Xuxu Liu
|
Siyuan Liang
|
Mengya Han
|
Yong Luo
|
Aishan Liu
|
Xiantao Cai
|
Zheng He
|
Dacheng Tao
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Generative large language models are crucial in natural language processing, but they are vulnerable to backdoor attacks, where subtle triggers compromise their behavior. Although backdoor attacks against LLMs are constantly emerging, existing benchmarks remain limited in terms of sufficient coverage of attack, metric system integrity, backdoor attack alignment. And existing pre-trained backdoor attacks are idealized in practice due to resource access constraints. Therefore we establish ELBA-Bench, a comprehensive and unified framework that allows attackers to inject backdoor through parameter efficient fine-tuning (e.g., LoRA) or without fine-tuning techniques (e.g., In-context-learning). ELBA-Bench provides over 1300 experiments encompassing the implementations of 12 attack methods, 18 datasets, and 12 LLMs. Extensive experiments provide new invaluable findings into the strengths and limitations of various attack strategies. For instance, PEFT attack consistently outperform without fine-tuning approaches in classification tasks while showing strong cross-dataset generalization with optimized triggers boosting robustness; Task-relevant backdoor optimization techniques or attack prompts along with clean and adversarial demonstrations can enhance backdoor attack success while preserving model performance on clean samples. Additionally, we introduce a universal toolbox designed for standardized backdoor attack research at https://github.com/NWPUliuxx/ELBA_Bench, with the goal of propelling further progress in this vital area.
2024
pdf
bib
abs
OOP: Object-Oriented Programming Evaluation Benchmark for Large Language Models
Shuai Wang
|
Liang Ding
|
Li Shen
|
Yong Luo
|
Bo Du
|
Dacheng Tao
Findings of the Association for Computational Linguistics: ACL 2024
Advancing automated programming necessitates robust and comprehensive code generation benchmarks, yet current evaluation frameworks largely neglect object-oriented programming (OOP) in favour of functional programming (FP), e.g., HumanEval and MBPP. To address this, our study introduces a pioneering OOP-focused benchmark, featuring 431 Python programs that encompass essential OOP concepts and features like classes and encapsulation methods. We propose a novel evaluation metric, pass@o, tailored for OOP, enhancing traditional pass@k metric. Our evaluation of 23 leading large language models (LLMs), including both general and code-specialized models, reveals three key insights: 1) pass@o offers a more relevant and comprehensive assessment for OOP code generation; 2) Despite excelling in FP, code-specialized LLMs like WizardCoder lag in OOP compared to models like ChatGPT; 3) The poor performance of all advanced LLMs on our OOP benchmark highlights a critical need for improvements in this field. Our benchmark and scripts will be publicly released at GitHub.
2022
pdf
bib
abs
Dependency Parsing via Sequence Generation
Boda Lin
|
Zijun Yao
|
Jiaxin Shi
|
Shulin Cao
|
Binghao Tang
|
Si Li
|
Yong Luo
|
Juanzi Li
|
Lei Hou
Findings of the Association for Computational Linguistics: EMNLP 2022
Dependency parsing aims to extract syntactic dependency structure or semantic dependency structure for sentences.Existing methods for dependency parsing include transition-based method, graph-based method and sequence-to-sequence method.These methods obtain excellent performance and we notice them belong to labeling method.Therefore, it may be very valuable and interesting to explore the possibility of using generative method to implement dependency parsing.In this paper, we propose to achieve Dependency Parsing (DP) via Sequence Generation (SG) by utilizing only the pre-trained language model without any auxiliary structures.We first explore different serialization designing strategies for converting parsing structures into sequences.Then we design dependency units and concatenate these units into the sequence for DPSG.We verify the DPSG is capable of parsing on widely used DP benchmarks, i.e., PTB, UD2.2, SDP15 and SemEval16.In addition, we also investigate the astonishing low-resource applicability of DPSG, which includes unsupervised cross-domain conducted on CODT and few-shot cross-task conducted on SDP15.Our research demonstrates that sequence generation is one of the effective methods to achieve dependency parsing.Our codes are available now.
2021
pdf
bib
abs
Unsupervised Domain Adaptation Method with Semantic-Structural Alignment for Dependency Parsing
Boda Lin
|
Mingzheng Li
|
Si Li
|
Yong Luo
Findings of the Association for Computational Linguistics: EMNLP 2021
Unsupervised cross-domain dependency parsing is to accomplish domain adaptation for dependency parsing without using labeled data in target domain. Existing methods are often of the pseudo-annotation type, which generates data through self-annotation of the base model and performing iterative training. However, these methods fail to consider the change of model structure for domain adaptation. In addition, the structural information contained in the text cannot be fully exploited. To remedy these drawbacks, we propose a Semantics-Structure Adaptative Dependency Parser (SSADP), which accomplishes unsupervised cross-domain dependency parsing without relying on pseudo-annotation or data selection. In particular, we design two feature extractors to extract semantic and structural features respectively. For each type of features, a corresponding feature adaptation method is utilized to achieve domain adaptation to align the domain distribution, which effectively enhances the unsupervised cross-domain transfer capability of the model. We validate the effectiveness of our model by conducting experiments on the CODT1 and CTB9 respectively, and the results demonstrate that our model can achieve consistent performance improvement. Besides, we verify the structure transfer ability of the proposed model by introducing Weisfeiler-Lehman Test.