Fali Wang
2024
InfuserKI: Enhancing Large Language Models with Knowledge Graphs via Infuser-Guided Knowledge Integration
Fali Wang | Runxue Bao | Suhang Wang | Wenchao Yu | Yanchi Liu | Wei Cheng | Haifeng Chen
Findings of the Association for Computational Linguistics: EMNLP 2024
Large Language Models (LLMs) have achieved exceptional capabilities in open generation across various domains, yet they struggle with knowledge-intensive tasks. To address this, knowledge-integration methods have been developed that augment LLMs with domain-specific knowledge graphs through external modules. These approaches, however, face data-inefficiency issues, as they must process both known and unknown knowledge for fine-tuning. Thus, our research focuses on a novel problem: efficiently integrating unknown knowledge into LLMs without unnecessary overlap with known knowledge. A risk of introducing new knowledge is the potential forgetting of existing knowledge. To mitigate this risk, we propose the InfuserKI framework, which uses transformer internal states to determine when to enrich LLM outputs with additional information, effectively preventing knowledge forgetting. Evaluations on the UMLS-2.5k and MetaQA domain knowledge graphs show that InfuserKI not only successfully integrates new knowledge but also outperforms state-of-the-art baselines, reducing knowledge forgetting by 9% and 6%, respectively.
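The gating idea described in the abstract lends itself to a compact illustration. Below is a minimal, hypothetical PyTorch-style sketch of an infuser-style gated adapter: the frozen base model's hidden states pass through a small adapter encoding new knowledge, and a learned gate over the internal state decides how much of the adapter output to infuse. The class and parameter names (e.g., GatedKnowledgeAdapter, bottleneck_dim) are assumptions for illustration, not the paper's released implementation.

```python
# Hypothetical sketch of an infuser-style gated adapter (illustrative names,
# not the authors' code). The gate inspects the transformer's internal hidden
# state and decides how much adapter (new-knowledge) signal to add, leaving
# the frozen base model's output largely untouched when knowledge is known.
import torch
import torch.nn as nn

class GatedKnowledgeAdapter(nn.Module):
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        # Low-rank adapter that encodes the newly integrated knowledge.
        self.adapter = nn.Sequential(
            nn.Linear(hidden_dim, bottleneck_dim),
            nn.ReLU(),
            nn.Linear(bottleneck_dim, hidden_dim),
        )
        # Infuser gate: maps the internal state to a scalar in [0, 1]
        # that controls whether the adapter output is infused.
        self.gate = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        g = self.gate(hidden_states)          # (batch, seq, 1)
        delta = self.adapter(hidden_states)   # (batch, seq, hidden)
        # Known knowledge -> gate near 0 -> output stays close to the base
        # model, which is how forgetting of existing knowledge is limited.
        return hidden_states + g * delta
```

In this sketch only the adapter and gate are trainable, so integrating new facts does not overwrite the base model's parameters.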
2023
Maximum Entropy Loss, the Silver Bullet Targeting Backdoor Attacks in Pre-trained Language Models
Zhengxiao Liu | Bowen Shen | Zheng Lin | Fali Wang | Weiping Wang
Findings of the Association for Computational Linguistics: ACL 2023
Pre-trained language models (PLMs) can be stealthily misled to target outputs by backdoor attacks when encountering poisoned samples, without performance degradation on clean samples. The stealthiness of backdoor attacks is commonly attained through minimal cross-entropy loss fine-tuning on a union of poisoned and clean samples. Existing defense paradigms provide a workaround by detecting and removing poisoned samples at pre-training or inference time. In contrast, we provide a new perspective in which the backdoor attack is directly reversed: maximum entropy loss is incorporated in training to neutralize the minimal cross-entropy loss fine-tuning on poisoned data. We defend against a range of backdoor attacks on classification tasks and significantly lower the attack success rate. As an extension, we explore the relationship between intended backdoor attacks and unintended dataset bias, and demonstrate the feasibility of the maximum entropy principle for de-biasing.
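For illustration, here is a minimal sketch of how a maximum-entropy term can be paired with cross-entropy during fine-tuning. The function name and the fixed weighting `lam` are assumptions for this example, not the authors' exact formulation.

```python
# Sketch: cross-entropy objective with a maximum-entropy regularizer.
# Minimizing (CE - lam * entropy) fits the labels while pushing the model's
# output distribution toward higher entropy, counteracting the overconfident,
# low-entropy trigger-to-target mapping that a backdoor relies on.
import torch
import torch.nn.functional as F

def max_entropy_regularized_loss(logits: torch.Tensor,
                                 labels: torch.Tensor,
                                 lam: float = 0.1) -> torch.Tensor:
    ce = F.cross_entropy(logits, labels)
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=-1).mean()
    return ce - lam * entropy
```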