Linyang Li


2021

pdf bib
SENT: Sentence-level Distant Relation Extraction via Negative Training
Ruotian Ma | Tao Gui | Linyang Li | Qi Zhang | Xuanjing Huang | Yaqian Zhou
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Distant supervision for relation extraction provides uniform bag labels for each sentence inside the bag, while accurate sentence labels are important for downstream applications that need the exact relation type. Directly using bag labels for sentence-level training will introduce much noise, thus severely degrading performance. In this work, we propose the use of negative training (NT), in which a model is trained using complementary labels regarding that “the instance does not belong to these complementary labels”. Since the probability of selecting a true label as a complementary label is low, NT provides less noisy information. Furthermore, the model trained with NT is able to separate the noisy data from the training data. Based on NT, we propose a sentence-level framework, SENT, for distant relation extraction. SENT not only filters the noisy data to construct a cleaner dataset, but also performs a re-labeling process to transform the noisy data into useful training data, thus further benefiting the model’s performance. Experimental results show the significant improvement of the proposed method over previous methods on sentence-level evaluation and de-noise effect.

pdf bib
Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning
Linyang Li | Demin Song | Xiaonan Li | Jiehang Zeng | Ruotian Ma | Xipeng Qiu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Pre-Trained Models have been widely applied and recently proved vulnerable under backdoor attacks: the released pre-trained weights can be maliciously poisoned with certain triggers. When the triggers are activated, even the fine-tuned model will predict pre-defined labels, causing a security threat. These backdoors generated by the poisoning methods can be erased by changing hyper-parameters during fine-tuning or detected by finding the triggers. In this paper, we propose a stronger weight-poisoning attack method that introduces a layerwise weight poisoning strategy to plant deeper backdoors; we also introduce a combinatorial trigger that cannot be easily detected. The experiments on text classification tasks show that previous defense methods cannot resist our weight-poisoning method, which indicates that our method can be widely applied and may provide hints for future model robustness studies.

pdf bib
Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution
Zongyi Li | Jianhan Xu | Jiehang Zeng | Linyang Li | Xiaoqing Zheng | Qi Zhang | Kai-Wei Chang | Cho-Jui Hsieh
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Recent studies have shown that deep neural network-based models are vulnerable to intentionally crafted adversarial examples, and various methods have been proposed to defend against adversarial word-substitution attacks for neural NLP models. However, there is a lack of systematic study on comparing different defense approaches under the same attacking setting. In this paper, we seek to fill the gap of systematic studies through comprehensive researches on understanding the behavior of neural text classifiers trained by various defense methods under representative adversarial attacks. In addition, we propose an effective method to further improve the robustness of neural text classifiers against such attacks, and achieved the highest accuracy on both clean and adversarial examples on AGNEWS and IMDB datasets by a significant margin. We hope this study could provide useful clues for future research on text adversarial defense. Codes are available at https://github.com/RockyLzy/TextDefender.

2020

pdf bib
BERT-ATTACK: Adversarial Attack Against BERT Using BERT
Linyang Li | Ruotian Ma | Qipeng Guo | Xiangyang Xue | Xipeng Qiu
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Adversarial attacks for discrete data (such as texts) have been proved significantly more challenging than continuous data (such as images) since it is difficult to generate adversarial samples with gradient-based methods. Current successful attack methods for texts usually adopt heuristic replacement strategies on the character or word level, which remains challenging to find the optimal solution in the massive space of possible combinations of replacements while preserving semantic consistency and language fluency. In this paper, we propose BERT-Attack, a high-quality and effective method to generate adversarial samples using pre-trained masked language models exemplified by BERT. We turn BERT against its fine-tuned models and other deep neural models in downstream tasks so that we can successfully mislead the target models to predict incorrectly. Our method outperforms state-of-the-art attack strategies in both success rate and perturb percentage, while the generated adversarial samples are fluent and semantically preserved. Also, the cost of calculation is low, thus possible for large-scale generations. The code is available at https://github.com/LinyangLee/BERT-Attack.