Kuo-Han Hung


2025

Attention Tracker: Detecting Prompt Injection Attacks in LLMs
Kuo-Han Hung | Ching-Yun Ko | Ambrish Rawat | I-Hsin Chung | Winston H. Hsu | Pin-Yu Chen
Findings of the Association for Computational Linguistics: NAACL 2025

Large Language Models (LLMs) have revolutionized various domains but remain vulnerable to prompt injection attacks, where malicious inputs manipulate the model into ignoring its original instructions and executing a designated action. In this paper, we investigate the underlying mechanisms of these attacks by analyzing the attention patterns within LLMs. We introduce the concept of the distraction effect, where specific attention heads, termed important heads, shift focus from the original instruction to the injected instruction. Building on this discovery, we propose Attention Tracker, a training-free detection method that tracks attention patterns on the instruction to detect prompt injection attacks without the need for additional LLM inference. Our method generalizes effectively across diverse models, datasets, and attack types, showing an AUROC improvement of up to 10.0% over existing methods, and performs well even on small LLMs. We demonstrate the robustness of our approach through extensive evaluations and provide insights into safeguarding LLM-integrated systems from prompt injection vulnerabilities.

MSR2: A Benchmark for Multi-Source Retrieval and Reasoning in Visual Question Answering
Kuo-Han Hung | Hung-Chieh Fang | Chao-Wei Huang | Yun-Nung Chen
Proceedings of the 4th International Workshop on Knowledge-Augmented Methods for Natural Language Processing

This paper introduces MSR2, a benchmark for multi-source retrieval and reasoning in visual question answering. Unlike previous knowledge-based visual question answering datasets, MSR2 focuses on questions involving multiple fine-grained entities, providing a unique opportunity to assess a model’s spatial reasoning ability and its capacity to retrieve and aggregate information from various sources for different entities. Through comprehensive evaluation using MSR2, we gain valuable insights into the capabilities and limitations of state-of-the-art large vision-language models (LVLMs). Our findings reveal that even state-of-the-art LVLMs struggle with questions requiring multiple entities and knowledge-intensive reasoning, highlighting important new directions for future research. Additionally, we demonstrate that enhanced visual entity recognition and knowledge retrieval can significantly improve performance on MSR2, pinpointing key areas for advancement.

2022

Open-Domain Conversational Question Answering with Historical Answers
Hung-Chieh Fang | Kuo-Han Hung | Chen-Wei Huang | Yun-Nung Chen
Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022

Open-domain conversational question answering can be viewed as two tasks: passage retrieval and conversational question answering, where the former relies on selecting candidate passages from a large corpus and the latter requires a better understanding of a question in its context to predict the answer. This paper proposes ConvADR-QA, which leverages historical answers to boost retrieval performance and, in turn, achieves better answering performance. Our experiments on the benchmark dataset, OR-QuAC, demonstrate that our model outperforms existing baselines in both extractive and generative reader settings, justifying the effectiveness of historical answers for open-domain conversational question answering.