Yaxin Liu
2024
Holistic Automated Red Teaming for Large Language Models through Top-Down Test Case Generation and Multi-turn Interaction
Jinchuan Zhang | Yan Zhou | Yaxin Liu | Ziming Li | Songlin Hu
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Automated red teaming is an effective method for identifying misaligned behaviors in large language models (LLMs). Existing approaches, however, often focus primarily on improving attack success rates while overlooking the need for comprehensive test case coverage. Additionally, most of these methods are limited to single-turn red teaming, failing to capture the multi-turn dynamics of real-world human-machine interactions. To overcome these limitations, we propose **HARM** (**H**olistic **A**utomated **R**ed tea**M**ing), which scales up the diversity of test cases using a top-down approach based on an extensible, fine-grained risk taxonomy. Our method also leverages a novel fine-tuning strategy and reinforcement learning techniques to facilitate multi-turn adversarial probing in a human-like manner. Experimental results demonstrate that our framework enables a more systematic understanding of model vulnerabilities and offers more targeted guidance for the alignment process.
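To make the top-down idea concrete, the sketch below walks a small risk taxonomy and emits seed test cases for every leaf, so coverage is driven by the taxonomy rather than by attack success alone. This is a minimal illustration in Python; the taxonomy entries, the `generate_test_cases` helper, and the placeholder prompts are assumptions for exposition, not HARM's actual categories or code.

```python
# Minimal sketch of top-down test case generation from a risk taxonomy.
# The categories and prompts below are illustrative placeholders.

RISK_TAXONOMY = {
    "privacy": ["doxxing", "location tracking"],
    "misinformation": ["medical advice", "election claims"],
}

def generate_test_cases(taxonomy, cases_per_leaf=3):
    """Visit every (category, subcategory) leaf and emit seed prompts,
    so every fine-grained risk gets covered, not just the easy attacks."""
    cases = []
    for category, leaves in taxonomy.items():
        for leaf in leaves:
            for i in range(cases_per_leaf):
                # In practice a generator LLM would produce diverse
                # adversarial prompts conditioned on the leaf description.
                cases.append({
                    "category": category,
                    "subcategory": leaf,
                    "prompt": f"[seed {i}] probe targeting {leaf}",
                })
    return cases

if __name__ == "__main__":
    for case in generate_test_cases(RISK_TAXONOMY, cases_per_leaf=1):
        print(case)
```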
2023
QAP: A Quantum-Inspired Adaptive-Priority-Learning Model for Multimodal Emotion Recognition
Ziming Li | Yan Zhou | Yaxin Liu | Fuqing Zhu | Chuanpeng Yang | Songlin Hu
Findings of the Association for Computational Linguistics: ACL 2023
Multimodal emotion recognition for video has gained considerable attention in recent years; it involves three modalities (textual, visual, and acoustic). Because the modalities carry different amounts of emotion-related information, they typically contribute to emotion recognition to varying degrees. More problematically, the emotion expressed by an individual modality may be inconsistent with that of the video as a whole. Both challenges stem from the inherent uncertainty of emotion. Inspired by recent advances in using quantum theory to model uncertainty, we make an initial attempt to design a quantum-inspired adaptive-priority-learning model (QAP) to address them. Specifically, quantum states are introduced to model modal features, which allows each modality to retain all of its emotional tendencies until the final classification. We further design Q-attention to integrate the three modalities in an ordered manner, and QAP then learns modal priority adaptively, so that each modality provides an amount of information matched to its priority. Experimental results on the IEMOCAP and MOSEI datasets show that QAP establishes new state-of-the-art results.
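As a rough illustration of the quantum-state view, the sketch below encodes each modality's emotional tendencies as a pure state (a density matrix built from a normalized feature vector) and combines the modalities into a priority-weighted mixed state, reading class probabilities off the diagonal only at the final step. The feature vectors, the fixed priority weights, and the helper names (`pure_state`, `mixed_state`) are toy assumptions; QAP learns its priorities adaptively and uses Q-attention, which is not reproduced here.

```python
import numpy as np

def pure_state(amplitudes):
    """Normalize a modality's feature vector and form the pure state
    |psi><psi| so all emotional tendencies are retained, not argmaxed."""
    psi = np.asarray(amplitudes, dtype=float)
    psi = psi / np.linalg.norm(psi)
    return np.outer(psi, psi)

def mixed_state(states, priorities):
    """Combine per-modality states with priority weights into a mixed
    state. QAP learns these priorities adaptively; here they are fixed."""
    w = np.asarray(priorities, dtype=float)
    w = w / w.sum()
    return sum(p * rho for p, rho in zip(w, states))

# Toy emotional-tendency amplitudes over 4 emotion classes per modality.
text  = pure_state([0.9, 0.3, 0.1, 0.1])
audio = pure_state([0.2, 0.8, 0.4, 0.1])
video = pure_state([0.3, 0.2, 0.9, 0.2])

rho = mixed_state([text, audio, video], priorities=[0.5, 0.3, 0.2])

# "Measuring" in the emotion-class basis: the diagonal of the density
# matrix gives class probabilities (it sums to 1), so every modality's
# tendencies survive until this final readout.
print(np.diag(rho))
```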