@inproceedings{lee-etal-2025-testset,
title = "A Testset for Context-Aware {LLM} Translation in {K}orean-to-{E}nglish Discourse Level Translation",
author = "Lee, Minjae and
Noh, Youngbin and
Lee, Seung Jin",
editor = "Rambow, Owen and
Wanner, Leo and
Apidianaki, Marianna and
Al-Khalifa, Hend and
Di Eugenio, Barbara and
Schockaert, Steven",
booktitle = "Proceedings of the 31st International Conference on Computational Linguistics",
month = jan,
year = "2025",
address = "Abu Dhabi, UAE",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/fix-sig-urls/2025.coling-main.110/",
pages = "1632--1646",
abstract = "Large Language Models (LLMs) demonstrate remarkable performance in machine translation. Recent studies indicate that for high-resource languages, LLM surpasses encoder-decoder neural machine translation (NMT) models. However, evaluation datasets used in many LLM-based translation studies are often compromised by data leakage and lack demanding datasets that accurately gauge the potential and limitations of LLMs in human-like translation. This paper introduces a manually constructed Korean-English discourse-level corpus comprising 600 text instances featuring six linguistic phenomena: lexical ambiguity, zero anaphora, slang, idiom, figurative language, and implicature. Utilizing this challenge test set, we investigated LLM{'}s Korean-to-English translation capability, particularly in cases requiring inter-sentential context based semantic inference. The findings reveal that state-of-the-art LLM, such as GPT-4o, still struggle with specific linguistic phenomena that can be challenging for machine translation. Additionally, step-by-step prompting, such as Chain-of-Thought (CoT) prompting, significantly enhance the translation performance of LLMs compared to zero-shot prompting."
}