Hyeongsik Kim

Also published as: HyeongSik Kim


2025

Over-Generation and Compaction: A Prompting Strategy for Procedural Text Adaptation with Large Language Models
Hyeongsik Kim | Yanheng Xu | Chaoqun Dong | Fei Du
Findings of the Association for Computational Linguistics: EMNLP 2025

Procedural text adaptation, such as modifying recipes or revising instructional guides, has traditionally relied on specialized models extensively fine-tuned for specific domains. To address the scalability limitations of such approaches, recent research has increasingly turned to general-purpose large language models (LLMs). However, existing prompting strategies for LLMs often yield superficial or erroneous adaptations due to alignment-induced biases and the inherent complexity of procedural editing. To overcome these challenges, we propose the Over-generation-and-Compaction (OC) prompting strategy, which first elicits an exhaustive set of procedural details to leverage the model’s latent knowledge, and subsequently compacts them into concise, coherent adaptations. We further introduce Recipe Consistency & Feasibility (RCF), a novel metric for systematically assessing procedural validity and practicality in cooking recipe adaptations. Experiments on public datasets demonstrate that OC significantly improves adaptation consistency and feasibility compared to baseline prompting methods, without the need for additional fine-tuning or curated training resources.

2023

A Textual Dataset for Situated Proactive Response Selection
Naoki Otani | Jun Araki | HyeongSik Kim | Eduard Hovy
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Recent data-driven conversational models are able to return fluent, consistent, and informative responses to many kinds of requests and utterances in task-oriented scenarios. However, these responses are typically limited to just the immediate local topic instead of being wider-ranging and proactively taking the conversation further, for example by making suggestions to help customers achieve their goals. This inadequacy reflects a lack of understanding of the interlocutor’s situation and implicit goal. To address the problem, we introduce a task of proactive response selection based on situational information. We present a manually curated dataset of 1.7k English conversation examples that include situational background information and, for each conversation, a set of responses, only some of which are acceptable in the situation. A responsive and informed conversation system should select the appropriate responses and avoid inappropriate ones; doing so demonstrates the ability to adequately understand the initiating request and situation. Our benchmark experiments show that this is not an easy task even for strong neural models, offering opportunities for future research.

On the Underspecification of Situations in Open-domain Conversational Datasets
Naoki Otani | Jun Araki | HyeongSik Kim | Eduard Hovy
Proceedings of the 5th Workshop on NLP for Conversational AI (NLP4ConvAI 2023)

Advances in open-domain conversational systems have been achieved through the creation of numerous conversation datasets. However, many of the commonly used datasets contain little or no information about the conversational situation, such as relevant objects/people, their properties, and relationships. This absence leads to underspecification of the problem space and typically results in undesired dialogue system behavior. This position paper discusses the current state of the field associated with processing situational information. An analysis of response generation using three datasets shows that explicitly provided situational information can improve the coherence and specificity of generated responses, but further experiments reveal that generation systems can be misled by irrelevant information. Our conclusions from this evaluation provide insights into the problem and directions for future research.