Taiwoo Park


2025

Taxonomy and Analysis of Sensitive User Queries in Generative AI Search System
Hwiyeol Jo | Taiwoo Park | Hyunwoo Lee | Nayoung Choi | Changbong Kim | Ohjoon Kwon | Donghyeon Jeon | Eui-Hyeon Lee | Kyoungho Shin | Sun Suk Lim | Kyungmi Kim | Jihye Lee | Sun Kim
Findings of the Association for Computational Linguistics: NAACL 2025

Although industry interest in integrating generative LLMs into services is growing, limited experience and scarce resources act as barriers to launching and operating large-scale LLM-based services. In this paper, we share our experiences in developing and operating generative AI models within a national-scale search engine, with a specific focus on the sensitivity of user queries. We propose a taxonomy for sensitive search queries, outline our approaches, and present a comprehensive analysis of sensitive queries from actual users. We believe that our experiences in launching generative AI search systems can help lower the barrier to building generative LLM-based services.

ZeroDL: Zero-shot Distribution Learning for Text Clustering via Large Language Models
Hwiyeol Jo | Hyunwoo Lee | Kang Min Yoo | Taiwoo Park
Findings of the Association for Computational Linguistics: ACL 2025

Advances in large language models (LLMs) have brought significant progress in NLP tasks. However, if a task cannot be fully described in a prompt, the model may fail to carry it out. In this paper, we propose a simple yet effective method to contextualize a task for an LLM. The method (1) runs open-ended zero-shot inference over the entire dataset, (2) aggregates the inference results, and (3) incorporates the aggregated meta-information into the actual task. We show its effectiveness in text clustering tasks, empowering LLMs to perform text-to-text-based clustering and leading to improvements on several datasets. Furthermore, we explore the generated class labels for clustering, showing how the LLM understands the task through data.

2024

SLM as Guardian: Pioneering AI Safety with Small Language Model
Ohjoon Kwon | Donghyeon Jeon | Nayoung Choi | Gyu-Hwung Cho | Hwiyeol Jo | Changbong Kim | Hyunwoo Lee | Inho Kang | Sun Kim | Taiwoo Park
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track

Most prior safety research on large language models (LLMs) has focused on enhancing the alignment of LLMs to better suit the safety requirements of their use cases. However, internalizing such safeguard features into larger models brings challenges of higher training cost and unintended degradation of helpfulness. In this paper, we leverage a smaller LLM for both harmful query detection and safeguard response generation. We introduce our safety requirements and the taxonomy of harmfulness categories, and then propose a multi-task learning mechanism that fuses the two tasks into a single model. We demonstrate the effectiveness of our approach, which achieves harmful query detection and safeguard response performance on par with or surpassing that of publicly available LLMs.