Kun Yang
2025
Parameter-free and Accessible Prompt Learning to Enhance Adversarial Robustness for Pre-trained Vision-Language Models
Xingran Zhou | Kun Yang | Changtao Miao | Bingyu Hu | Zhuoer Xu | Shiwen Cui | Changhua Meng | Dan Hong
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Large pre-trained Vision-Language Models (VLMs) have revolutionized both computer vision and natural language processing. Despite their success, adversarial examples can still mislead VLMs into producing incorrect results. This work focuses on boosting the adversarial robustness of VLMs by searching for text prompts at the word level, rather than optimizing continuous textual embeddings. We introduce Parameter-Free Prompt Tuning (PFPT) to learn defense words that, when appended to existing prompts, enhance resilience against adversarial attacks; the simplicity of this approach makes it easy to use. Because the defense words are drawn from the VLM’s own vocabulary, the resulting prompts remain human-readable. PFPT employs a coarse-to-fine strategy with carefully designed optimization objectives to guide the word search. Extensive experiments demonstrate our method’s superiority over hand-engineered prompts and other state-of-the-art methods. PFPT significantly boosts accuracy and robustness, outperforming hand-engineered prompts by average gains of +4.9% and +5.8%, respectively (ε = 1/255).
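The abstract describes a discrete, word-level search for defense words. As an illustration only (this is not the authors’ code, and the paper’s coarse-to-fine objectives are not reproduced here), a minimal greedy variant of such a search might look like the following sketch, where VOCAB, robust_accuracy, and greedy_defense_search are all hypothetical stand-ins; robust_accuracy is a stub that would, in practice, evaluate a VLM such as CLIP on adversarially perturbed images.

```python
# Hypothetical sketch of a word-level defense-prompt search (not the authors' method).
import random

# Toy candidate vocabulary; a real search would use the VLM's full token vocabulary.
VOCAB = ["robust", "clean", "photo", "noisy", "original", "natural"]

def robust_accuracy(prompt: str) -> float:
    """Stub scorer: replace with zero-shot accuracy of a VLM on adversarial images."""
    random.seed(hash(prompt) % (2**32))  # deterministic per prompt within one run
    return random.random()

def greedy_defense_search(base_prompt: str, vocab, num_words: int = 3) -> str:
    """Greedily append the vocabulary word that most improves the robustness score."""
    prompt = base_prompt
    for _ in range(num_words):
        best_word, best_score = None, robust_accuracy(prompt)
        for w in vocab:
            score = robust_accuracy(f"{prompt} {w}")
            if score > best_score:
                best_word, best_score = w, score
        if best_word is None:  # no candidate improves the score; stop early
            break
        prompt = f"{prompt} {best_word}"
    return prompt

print(greedy_defense_search("a photo of a {}", VOCAB))
```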
2022
Identifying Tension in Holocaust Survivors’ Interview: Code-switching/Code-mixing as Cues
Xinyuan Xia | Lu Xiao | Kun Yang | Yueyue Wang
Proceedings of the Thirteenth Language Resources and Evaluation Conference
In this study, we investigate how code-switching and code-mixing (CS/CM), as a linguistic phenomenon, can signal tension in Holocaust survivors’ interviews. We first created an interview corpus (39 interviews in total) with manually annotated CS/CM codes (802 quotations in total). We then compared our annotations with the tension places in the corpus, which were identified by a computational tool. Most of our annotations fell within the detected tension places, a relatively strong result. This finding suggests that CS/CM can serve as a cue for detecting tension in this communication context. Our CS/CM-annotated interview corpus is openly accessible. Besides annotating and examining CS/CM occurrences, we also annotated instances of silence in this open corpus, as silence has been shown to be an indicator of tension in interpersonal communication. By making this corpus openly accessible, we call for more research on tension detection.