Yeonsoo Lee


HaRiM+: Evaluating Summary Quality with Hallucination Risk
Seonil (Simon) Son | Junsoo Park | Jeong-in Hwang | Junghwa Lee | Hyungjong Noh | Yeonsoo Lee
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

One of the challenges of developing a summarization model arises from the difficulty of measuring the factual inconsistency of the generated text. In this study, we reinterpret the decoder overconfidence-regularizing objective suggested by Miao et al. (2021) as a hallucination risk measurement to better estimate the quality of generated summaries. We propose a reference-free metric, HaRiM+, which requires only an off-the-shelf summarization model to compute the hallucination risk based on token likelihoods. Deploying it requires no additional training of models or ad-hoc modules, which usually need alignment to human judgments. For summary-quality estimation, HaRiM+ records state-of-the-art correlation with human judgment on three summary-quality annotation sets: FRANK, QAGS, and SummEval. We hope that our work, which makes use of summarization models, facilitates progress in both the automated evaluation and the generation of summaries.

Proceedings of the 1st Workshop on Customized Chat Grounding Persona and Knowledge
Heuiseok Lim | Seungryong Kim | Yeonsoo Lee | Steve Lin | Paul Hongsuck Seo | Yumin Suh | Yoonna Jang | Jungwoo Lim | Yuna Hur | Suhyune Son


Two-Step Training and Mixed Encoding-Decoding for Implementing a Generative Chatbot with a Small Dialogue Corpus
Jintae Kim | Hyeon-Gu Lee | Harksoo Kim | Yeonsoo Lee | Young-Gil Kim
Proceedings of the Workshop on Intelligent Interactive Systems and Language Generation (2IS&NLG)