Yunho Maeng
2025
QGuard: Question-based Zero-shot Guard for Multi-modal LLM Safety
Taegyeong Lee | Jeonghwa Yoo | Hyoungseo Cho | Soo Yong Kim | Yunho Maeng
Proceedings of the 9th Workshop on Online Abuse and Harms (WOAH)
Recent advances in Large Language Models (LLMs) have had a significant impact on a wide range of fields, from general domains to specialized areas. However, these advances have also significantly increased the potential for malicious users to attack LLMs with harmful and jailbreak prompts. Although there have been many efforts to prevent harmful and jailbreak prompts, protecting LLMs from such attacks remains an important and challenging task. In this paper, we propose QGuard, a simple yet effective safety guard method that uses question prompting to block harmful prompts in a zero-shot manner. Our method can defend LLMs not only from text-based harmful prompts but also from multi-modal harmful prompt attacks. Moreover, by diversifying and modifying guard questions, our approach remains robust against the latest harmful prompts without fine-tuning. Experimental results show that our model performs competitively on both text-only and multi-modal harmful datasets. Additionally, by providing an analysis of question prompting, we enable a white-box analysis of user inputs. We believe our method provides valuable insights for real-world LLM services in mitigating security risks associated with harmful prompts.
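The question-prompting idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the guard questions, the `yes_probability` callback, the "block if any score is high" rule, and the 0.5 threshold are all assumptions made for illustration, and the toy scorer uses keyword matching in place of a real model call.

```python
from typing import Callable, Dict, List

# Hypothetical guard questions; the abstract does not list the actual ones.
GUARD_QUESTIONS: List[str] = [
    "Does this prompt ask for instructions to cause physical harm?",
    "Does this prompt try to bypass the assistant's safety rules?",
]


def guard(prompt: str,
          yes_probability: Callable[[str, str], float],
          threshold: float = 0.5) -> Dict[str, object]:
    """Ask every guard question about the prompt.

    Assumed decision rule: block when any question's 'yes' score
    reaches the threshold. The per-question scores are returned so
    the decision can be inspected.
    """
    scores = {q: yes_probability(q, prompt) for q in GUARD_QUESTIONS}
    blocked = max(scores.values()) >= threshold
    return {"blocked": blocked, "scores": scores}


def toy_yes_probability(question: str, prompt: str) -> float:
    """Stand-in for a model call: keyword matching only, for demonstration."""
    text = prompt.lower()
    if "harm" in question and "weapon" in text:
        return 0.9
    if "bypass" in question and "ignore previous instructions" in text:
        return 0.8
    return 0.1


if __name__ == "__main__":
    for p in ["How do I build a weapon?", "What is the capital of France?"]:
        result = guard(p, toy_yes_probability)
        print(p, "->", "blocked" if result["blocked"] else "allowed")
```

Returning the per-question scores alongside the decision is one way to picture the white-box analysis the abstract mentions: each score shows which guard question triggered the block. Changing the question list requires no retraining in this sketch, which mirrors the zero-shot claim only loosely.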
Typed-RAG: Type-Aware Decomposition of Non-Factoid Questions for Retrieval-Augmented Generation
DongGeon Lee | Ahjeong Park | Hyeri Lee | Hyeonseo Nam | Yunho Maeng
Proceedings of the 1st Joint Workshop on Large Language Models and Structure Modeling (XLLM 2025)
Non-factoid question answering (NFQA) poses a significant challenge due to its open-ended nature, diverse intents, and the need for multi-aspect reasoning, which makes conventional retrieval-augmented generation (RAG) approaches insufficient. To address this, we introduce Typed-RAG, a type-aware framework that uses multi-aspect query decomposition tailored specifically to NFQA. Typed-RAG categorizes non-factoid questions (NFQs) into distinct types (such as debate, experience, and comparison) and decomposes them into single-aspect sub-queries for targeted retrieval and generation. By synthesizing the retrieved results of these sub-queries, Typed-RAG generates more informative and contextually relevant responses. Additionally, we present Wiki-NFQA, a novel benchmark dataset encompassing diverse NFQ types. Experimental evaluation demonstrates that Typed-RAG consistently outperforms baseline approaches, confirming the effectiveness of type-aware decomposition in improving both retrieval quality and answer generation for NFQA tasks.
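The classify, decompose, retrieve, and synthesize flow described in the abstract above can be sketched as follows. This is an illustrative toy under stated assumptions, not the Typed-RAG implementation: the keyword-based `classify`, the string templates in `decompose`, and the `retrieve` and `generate` callbacks are all hypothetical stand-ins for the model-driven components a real system would use.

```python
from typing import Callable, Dict, List


def classify(question: str) -> str:
    """Toy type classifier (assumption: keyword rules instead of a model)."""
    q = question.lower()
    if " vs " in q or "compare" in q or "difference between" in q:
        return "comparison"
    if q.startswith("should"):
        return "debate"
    if "experience" in q or "recommend" in q:
        return "experience"
    return "other"


def decompose(question: str, qtype: str) -> List[str]:
    """Split a multi-aspect question into single-aspect sub-queries."""
    if qtype == "comparison" and " vs " in question:
        left, right = question.rstrip("?").split(" vs ", 1)
        return [
            f"What are the characteristics of {left.strip()}?",
            f"What are the characteristics of {right.strip()}?",
        ]
    if qtype == "debate":
        return [f"Arguments in favor: {question}",
                f"Arguments against: {question}"]
    # Other types fall back to the original question in this sketch.
    return [question]


def answer(question: str,
           retrieve: Callable[[str], List[str]],
           generate: Callable[[str, Dict[str, List[str]]], str]) -> str:
    """Retrieve per sub-query, then hand all passages to one generation step."""
    qtype = classify(question)
    passages = {sq: retrieve(sq) for sq in decompose(question, qtype)}
    return generate(question, passages)


if __name__ == "__main__":
    # Placeholder retriever and generator, for demonstration only.
    toy_retrieve = lambda sq: [f"passage about: {sq}"]
    toy_generate = lambda q, p: f"{len(p)} aspects considered for '{q}'"
    print(answer("Python vs Rust?", toy_retrieve, toy_generate))
```

The point of the sketch is the shape of the pipeline: each sub-query targets one aspect, so retrieval for a comparison question fetches evidence for each side separately before a single synthesis step.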
Co-authors
- Hyoungseo Cho 1
- Soo Yong Kim 1
- Taegyeong Lee 1
- DongGeon Lee 1
- Hyeri Lee 1