Making Sense of Korean Sentences: A Comprehensive Evaluation of LLMs through KoSEnd Dataset

Seunguk Yu, Kyeonghyun Kim, JungMin Yun, YoungBin Kim


Abstract
Although LLMs have made significant progress across many languages, concerns remain about their effectiveness on low-resource agglutinative languages compared to high-resource languages such as English. In this study, we focus on Korean, a language known for its complex sentence endings, and evaluate LLMs on this challenging aspect. We introduce the Korean Sentence Endings (KoSEnd) dataset, which comprises 3,000 sentences, each annotated for the naturalness of 15 sentence ending forms, collected from diverse sources to cover a broad range of contexts. We evaluated 11 LLMs on their understanding of Korean sentence endings, analyzing performance by parameter count and prediction consistency. Notably, we found that informing models about the possibility of missing sentence endings improved performance, highlighting the impact of explicitly accounting for specific linguistic features.
Anthology ID:
2025.acl-srw.29
Volume:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Jin Zhao, Mingyang Wang, Zhu Liu
Venues:
ACL | WS
Publisher:
Association for Computational Linguistics
Pages:
455–469
URL:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-srw.29/
Cite (ACL):
Seunguk Yu, Kyeonghyun Kim, JungMin Yun, and YoungBin Kim. 2025. Making Sense of Korean Sentences: A Comprehensive Evaluation of LLMs through KoSEnd Dataset. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop), pages 455–469, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Making Sense of Korean Sentences: A Comprehensive Evaluation of LLMs through KoSEnd Dataset (Yu et al., ACL 2025)
PDF:
https://preview.aclanthology.org/ingestion-acl-25/2025.acl-srw.29.pdf