Donghyeon Lee


2025

pdf bib
ETHIC: Evaluating Large Language Models on Long-Context Tasks with High Information Coverage
Taewhoo Lee | Chanwoong Yoon | Kyochul Jang | Donghyeon Lee | Minju Song | Hyunjae Kim | Jaewoo Kang
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)

Recent advancements in large language models (LLM) capable of processing extremely long texts highlight the need for a dedicated evaluation benchmark to assess their long-context capabilities. However, existing methods, like the needle-in-a-haystack test, do not effectively assess whether these models fully utilize contextual information, raising concerns about the reliability of current evaluation techniques. To thoroughly examine the effectiveness of existing benchmarks, we introduce a new metric called information coverage (IC), which quantifies the proportion of the input context necessary for answering queries. Our findings indicate that current benchmarks exhibit low IC; although the input context may be extensive, the actual usable context is often limited. To address this, we present ETHIC, a novel benchmark designed to assess LLMs’ ability to leverage the entire context. Our benchmark comprises 1,986 test instances spanning four long-context tasks with high IC scores in the domains of books, debates, medicine, and law. Our evaluations reveal significant performance drops in contemporary LLMs, highlighting a critical challenge in managing long contexts. Our benchmark is available at https://github.com/dmis-lab/ETHIC.

2013

pdf bib
Counseling Dialog System with 5W1H Extraction
Sangdo Han | Kyusong Lee | Donghyeon Lee | Gary Geunbae Lee
Proceedings of the SIGDIAL 2013 Conference

2012

pdf bib
A Hierarchical Domain Model-Based Multi-Domain Selection Framework for Multi-Domain Dialog Systems
Seonghan Ryu | Donghyeon Lee | Injae Lee | Sangdo Han | Gary Geunbae Lee | Myungjae Kim | Kyungduk Kim
Proceedings of COLING 2012: Posters

2008

pdf bib
Transformation-based Sentence Splitting method for Statistical Machine Translation
Jonghoon Lee | Donghyeon Lee | Gary Geunbae Lee
Proceedings of the Workshop on Technologies and Corpora for Asia-Pacific Speech Translation (TCAST)

2007

pdf bib
POSSLT: A Korean to English Spoken Language Translation System
Donghyeon Lee | Jonghoon Lee | Gary Geunbae Lee
Proceedings of Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT)