Donggeon Lee

Also published as: DongGeon Lee

2025

pdf bib abs
Theme-Explanation Structure for Table Summarization using Large Language Models: A Case Study on Korean Tabular Data
TaeYoon Kwack | Jisoo Kim | Ki Yong Jung | DongGeon Lee | Heesun Park
Proceedings of the 4th Table Representation Learning Workshop

Tables are a primary medium for conveying critical information in administrative domains, yet their complexity hinders utilization by Large Language Models (LLMs). This paper introduces the Theme-Explanation Structure-based Table Summarization (Tabular-TX) pipeline, a novel approach designed to generate highly interpretable summaries from tabular data, with a specific focus on Korean administrative documents. Current table summarization methods often neglect the crucial aspect of human-friendly output. Tabular-TX addresses this by first employing a multi-step reasoning process to ensure deep table comprehension by LLMs, followed by a journalist persona prompting strategy for clear sentence generation. Crucially, it then structures the output into a Theme Part (an adverbial phrase) and an Explanation Part (a predicative clause), significantly enhancing readability. Our approach leverages in-context learning, obviating the need for extensive fine-tuning and associated labeled data or computational resources. Experimental results show that Tabular-TX effectively processes complex table structures and metadata, offering a robust and efficient solution for generating human-centric table summaries, especially in low-resource scenarios.

pdf bib abs
Typed-RAG: Type-Aware Decomposition of Non-Factoid Questions for Retrieval-Augmented Generation
DongGeon Lee | Ahjeong Park | Hyeri Lee | Hyeonseo Nam | Yunho Maeng
Proceedings of the 1st Joint Workshop on Large Language Models and Structure Modeling (XLLM 2025)

Non-factoid question answering (NFQA) poses a significant challenge due to its open-ended nature, diverse intents, and the necessity for multi-aspect reasoning, rendering conventional retrieval-augmented generation (RAG) approaches insufficient. To address this, we introduce Typed-RAG, a type-aware framework utilizing multi-aspect query decomposition tailored specifically for NFQA. Typed-RAG categorizes NFQs into distinct types—such as debate, experience, and comparison—and decomposes them into single-aspect sub-queries for targeted retrieval and generation. By synthesizing the retrieved results of these sub-queries, Typed-RAG generates more informative and contextually relevant responses. Additionally, we present Wiki-NFQA, a novel benchmark dataset encompassing diverse NFQ types. Experimental evaluation demonstrates that TypeRAG consistently outperforms baseline approaches, confirming the effectiveness of type-aware decomposition in improving both retrieval quality and answer generation for NFQA tasks.

2020

pdf bib abs
Generating Equation by Utilizing Operators : GEO model
Kyung Seo Ki | Donggeon Lee | Bugeun Kim | Gahgene Gweon
Proceedings of the 28th International Conference on Computational Linguistics

Math word problem solving is an emerging research topic in Natural Language Processing. Recently, to address the math word problem-solving task, researchers have applied the encoder-decoder architecture, which is mainly used in machine translation tasks. The state-of-the-art neural models use hand-crafted features and are based on generation methods. In this paper, we propose the GEO (Generation of Equations by utilizing Operators) model that does not use hand-crafted features and addresses two issues that are present in existing neural models: 1. missing domain-specific knowledge features and 2. losing encoder-level knowledge. To address missing domain-specific feature issue, we designed two auxiliary tasks: operation group difference prediction and implicit pair prediction. To address losing encoder-level knowledge issue, we added an Operation Feature Feed Forward (OP3F) layer. Experimental results showed that the GEO model outperformed existing state-of-the-art models on two datasets, 85.1% in MAWPS, and 62.5% in DRAW-1K, and reached comparable performance of 82.1% in ALG514 dataset.

pdf bib abs
Point to the Expression: Solving Algebraic Word Problems using the Expression-Pointer Transformer Model
Bugeun Kim | Kyung Seo Ki | Donggeon Lee | Gahgene Gweon
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Solving algebraic word problems has recently emerged as an important natural language processing task. To solve algebraic word problems, recent studies suggested neural models that generate solution equations by using ‘Op (operator/operand)’ tokens as a unit of input/output. However, such a neural model suffered two issues: expression fragmentation and operand-context separation. To address each of these two issues, we propose a pure neural model, Expression-Pointer Transformer (EPT), which uses (1) ‘Expression’ token and (2) operand-context pointers when generating solution equations. The performance of the EPT model is tested on three datasets: ALG514, DRAW-1K, and MAWPS. Compared to the state-of-the-art (SoTA) models, the EPT model achieved a comparable performance accuracy in each of the three datasets; 81.3% on ALG514, 59.5% on DRAW-1K, and 84.5% on MAWPS. The contribution of this paper is two-fold; (1) We propose a pure neural model, EPT, which can address the expression fragmentation and the operand-context separation. (2) The fully automatic EPT model, which does not use hand-crafted features, yields comparable performance to existing models using hand-crafted features, and achieves better performance than existing pure neural models by at most 40%.

Co-authors

Venues

Fix author