Gary Geunbae Lee

Also published as: Geunbae Lee


2023

Prompt- and Trait Relation-aware Cross-prompt Essay Trait Scoring
Heejin Do | Yunsu Kim | Gary Geunbae Lee
Findings of the Association for Computational Linguistics: ACL 2023

Automated essay scoring (AES) aims to score essays written for a given prompt, which defines the writing topic. Most existing AES systems assume that the essays to be graded share the prompt used in training, and they assign only a holistic score. However, such settings conflict with real educational situations: pre-graded essays for a particular prompt are often lacking, and detailed trait scores for sub-rubrics are required. Thus, predicting various trait scores of unseen-prompt essays (called cross-prompt essay trait scoring) remains a challenge for AES. In this paper, we propose a robust model: a prompt- and trait relation-aware cross-prompt essay trait scorer. We encode a prompt-aware essay representation using essay-prompt attention and a topic-coherence feature extracted by a topic-modeling mechanism without access to labeled data; therefore, our model considers the prompt adherence of an essay, even in a cross-prompt setting. To facilitate multi-trait scoring, we design a trait-similarity loss that encapsulates the correlations among traits. Experiments demonstrate the efficacy of our model, showing state-of-the-art results for all prompts and traits. Significant improvements on low-resource prompts and inferior traits further indicate our model’s strength.

2022

Schema Encoding for Transferable Dialogue State Tracking
Hyunmin Jeon | Gary Geunbae Lee
Proceedings of the 29th International Conference on Computational Linguistics

Dialogue state tracking (DST) is an essential sub-task for task-oriented dialogue systems. Recent work has focused on deep neural models for DST. However, neural models require a large dataset for training. Furthermore, applying them to another domain requires a new dataset because neural models are generally trained to imitate the given dataset. In this paper, we propose Schema Encoding for Transferable Dialogue State Tracking (SET-DST), a neural DST method for effective transfer to new domains. Transferable DST could assist the development of dialogue systems even when little data is available for the target domains. We use a schema encoder not just to imitate the dataset but to comprehend the schema of the dataset. We aim to transfer the model to new domains by encoding new schemas and using them for DST in multi-domain settings. As a result, SET-DST improved the joint accuracy by 1.46 points on MultiWOZ 2.1.

Conversational QA Dataset Generation with Answer Revision
Seonjeong Hwang | Gary Geunbae Lee
Proceedings of the 29th International Conference on Computational Linguistics

Conversational question-answer generation is a task that automatically generates a large-scale conversational question answering dataset based on input passages. In this paper, we introduce a novel framework that extracts question-worthy phrases from a passage and then generates corresponding questions that take the previous conversation into account. In particular, our framework revises the extracted answers after generating questions so that the answers exactly match their paired questions. Experimental results show that our simple answer revision approach leads to significant improvement in the quality of synthetic data. Moreover, we show that our framework can be effectively utilized for domain adaptation of conversational question answering.

Multi-Type Conversational Question-Answer Generation with Closed-ended and Unanswerable Questions
Seonjeong Hwang | Yunsu Kim | Gary Geunbae Lee
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Conversational question answering (CQA) facilitates an incremental and interactive understanding of a given context, but building a CQA system is difficult for many domains because of data scarcity. In this paper, we introduce a novel method to synthesize data for CQA with various question types, including open-ended, closed-ended, and unanswerable questions. We design a different generation flow for each question type and effectively combine them in a single, shared framework. Moreover, we devise a hierarchical answerability classification (hierarchical AC) module that improves the quality of the synthetic data while acquiring unanswerable questions. Manual inspections show that synthetic data generated with our framework have characteristics very similar to those of human-generated conversations. Across four domains, CQA systems trained on our synthetic data show performance close to that of systems trained on human-annotated data.

2018

Out-of-domain Detection based on Generative Adversarial Network
Seonghan Ryu | Sangjun Koo | Hwanjo Yu | Gary Geunbae Lee
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

The main goal of this paper is to develop out-of-domain (OOD) detection for dialog systems. We propose to use only in-domain (IND) sentences to build a generative adversarial network (GAN) whose discriminator generates low scores for OOD sentences. To improve on basic GANs, we apply a feature matching loss in the discriminator, use domain-category analysis as an additional task in the discriminator, and remove the biases in the generator. Thereby, we reduce the huge effort of collecting OOD sentences for training OOD detection. For evaluation, we ran OOD detection experiments on a multi-domain dialog system. The experimental results showed that the proposed method was more accurate than the existing methods.

2015

Exploiting knowledge base to generate responses for natural language dialog listening agents
Sangdo Han | Jeesoo Bang | Seonghan Ryu | Gary Geunbae Lee
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Conversational Knowledge Teaching Agent that uses a Knowledge Base
Kyusong Lee | Paul Hongsuck Seo | Junhwi Choi | Sangjun Koo | Gary Geunbae Lee
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Question Answering System using Multiple Information Source and Open Type Answer Merge
Seonyeong Park | Soonchoul Kwon | Byungsoo Kim | Sangdo Han | Hyosup Shim | Gary Geunbae Lee
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

2014

POSTECH Grammatical Error Correction System in the CoNLL-2014 Shared Task
Kyusong Lee | Gary Geunbae Lee
Proceedings of the Eighteenth Conference on Computational Natural Language Learning: Shared Task

2013

Counseling Dialog System with 5W1H Extraction
Sangdo Han | Kyusong Lee | Donghyeon Lee | Gary Geunbae Lee
Proceedings of the SIGDIAL 2013 Conference

2012

Grammatical Error Annotation for Korean Learners of Spoken English
Hongsuck Seo | Kyusong Lee | Gary Geunbae Lee | Soo-Ok Kweon | Hae-Ri Kim
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The goal of our research is to build a grammatical error-tagged corpus for Korean learners of spoken English, dubbed the Postech Learner Corpus. We collected raw story-telling speech from Korean university students. Transcription and annotation using the Cambridge Learner Corpus tagset were performed by six Korean annotators fluent in English. For the annotation of the corpus, we developed an annotation tool and a validation tool. After comparing human annotation with machine-recommended error tags, unmatched errors were rechecked by a native annotator. We observed different characteristics between the spoken language corpus built in this study and an existing written language corpus.

Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Haizhou Li | Chin-Yew Lin | Miles Osborne | Gary Geunbae Lee | Jong C. Park
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
Haizhou Li | Chin-Yew Lin | Miles Osborne | Gary Geunbae Lee | Jong C. Park
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relation Extraction
Seokhwan Kim | Gary Geunbae Lee
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

A Meta Learning Approach to Grammatical Error Correction
Hongsuck Seo | Jonghoon Lee | Seokhwan Kim | Kyusong Lee | Sechun Kang | Gary Geunbae Lee
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

A Hierarchical Domain Model-Based Multi-Domain Selection Framework for Multi-Domain Dialog Systems
Seonghan Ryu | Donghyeon Lee | Injae Lee | Sangdo Han | Gary Geunbae Lee | Myungjae Kim | Kyungduk Kim
Proceedings of COLING 2012: Posters

Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue
Gary Geunbae Lee | Jonathan Ginzburg | Claire Gardent | Amanda Stent
Proceedings of the 13th Annual Meeting of the Special Interest Group on Discourse and Dialogue

2011

A Cross-lingual Annotation Projection-based Self-supervision Approach for Open Information Extraction
Seokhwan Kim | Minwoo Jeong | Jonghoon Lee | Gary Geunbae Lee
Proceedings of 5th International Joint Conference on Natural Language Processing

POMY: A Conversational Virtual Environment for Language Learning in POSTECH
Hyungjong Noh | Kyusong Lee | Sungjin Lee | Gary Geunbae Lee
Proceedings of the SIGDIAL 2011 Conference

2010

A Cross-lingual Annotation Projection Approach for Relation Detection
Seokhwan Kim | Minwoo Jeong | Jonghoon Lee | Gary Geunbae Lee
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

2009

Automatic Agenda Graph Construction from Human-Human Dialogs using Clustering Method
Cheongjae Lee | Sangkeun Jung | Kyungduk Kim | Gary Geunbae Lee
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

A Local Tree Alignment-based Soft Pattern Matching Approach for Information Extraction
Seokhwan Kim | Minwoo Jeong | Gary Geunbae Lee
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

Hybrid Approach to User Intention Modeling for Dialog Simulation
Sangkeun Jung | Cheongjae Lee | Kyungduk Kim | Gary Geunbae Lee
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

Realistic Grammar Error Simulation using Markov Logic
Sungjin Lee | Gary Geunbae Lee
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

Efficient Inference of CRFs for Large-Scale Natural Language Data
Minwoo Jeong | Chin-Yew Lin | Gary Geunbae Lee
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

Proceedings of the ACL-IJCNLP 2009 Software Demonstrations
Gary Geunbae Lee | Sabine Schulte im Walde
Proceedings of the ACL-IJCNLP 2009 Software Demonstrations

Semi-supervised Speech Act Recognition in Emails and Forums
Minwoo Jeong | Chin-Yew Lin | Gary Geunbae Lee
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

2008

Transformation-based Sentence Splitting method for Statistical Machine Translation
Jonghoon Lee | Donghyeon Lee | Gary Geunbae Lee
Proceedings of the Workshop on Technologies and Corpora for Asia-Pacific Speech Translation (TCAST)

Robust Dialog Management with N-Best Hypotheses Using Dialog Examples and Agenda
Cheongjae Lee | Sangkeun Jung | Gary Geunbae Lee
Proceedings of ACL-08: HLT

A Frame-Based Probabilistic Framework for Spoken Dialog Management Using Dialog Examples
Kyungduk Kim | Cheongjae Lee | Sangkeun Jung | Gary Geunbae Lee
Proceedings of the 9th SIGdial Workshop on Discourse and Dialogue

An Integrated Dialog Simulation Technique for Evaluating Spoken Dialog Systems
Sangkeun Jung | Cheongjae Lee | Kyungduk Kim | Gary Geunbae Lee
Coling 2008: Proceedings of the workshop on Speech Processing for Safety Critical Translation and Pervasive Applications

POSTECH machine translation system for IWSLT 2008 evaluation campaign.
Jonghoon Lee | Gary Geunbae Lee
Proceedings of the 5th International Workshop on Spoken Language Translation: Evaluation Campaign

In this paper, we describe the POSTECH system for the IWSLT 2008 evaluation campaign. The system is based on phrase-based statistical machine translation. We set up a baseline system using well-known, freely available software. A preprocessing method and a language modeling method were applied to the baseline system to improve machine translation quality. The preprocessing method identifies and removes useless tokens in source texts, and the language modeling method models phrase-level n-grams. We participated in the BTEC tasks to see the effects of our methods.

2007

A Joint Statistical Model for Simultaneous Word Spacing and Spelling Error Correction for Korean
Hyungjong Noh | Jeong-Won Cha | Gary Geunbae Lee
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions

POSSLT: A Korean to English Spoken Language Translation System
Donghyeon Lee | Jonghoon Lee | Gary Geunbae Lee
Proceedings of Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT)

2006

Exploiting Non-Local Features for Spoken Language Understanding
Minwoo Jeong | Gary Geunbae Lee
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

MMR-based Active Machine Learning for Bio Named Entity Recognition
Seokhwan Kim | Yu Song | Kyungduk Kim | Jeong-Won Cha | Gary Geunbae Lee
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers

2005

POSBIOTM/W: A Development Workbench for Machine Learning Oriented Biomedical Text Mining System
Kyungduk Kim | Yu Song | Gary Geunbae Lee
Proceedings of HLT/EMNLP 2005 Interactive Demonstrations

Heuristic Methods for Reducing Errors of Geographic Named Entities Learned by Bootstrapping
Seungwoo Lee | Gary Geunbae Lee
Second International Joint Conference on Natural Language Processing: Full Papers

2004

MMR-based Feature Selection for Text Categorization
Changki Lee | Gary Geunbae Lee
Proceedings of HLT-NAACL 2004: Short Papers

POSBIOTM-NER in the Shared Task of BioNLP/NLPBA2004
Yu Song | Eunju Kim | Gary Geunbae Lee | Byoung-kee Yi
Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and its Applications (NLPBA/BioNLP)

Using Higher-level Linguistic Knowledge for Speech Recognition Error Correction in a Spoken Q/A Dialog
Minwoo Jeong | Byeongchang Kim | Gary Geunbae Lee
Proceedings of the HLT-NAACL 2004 Workshop on Spoken Language Understanding for Conversational Systems and Higher Level Linguistic Information for Speech Processing

2003

Automatic Acquisition of Named Entity Tagged Corpus from World Wide Web
Joohui An | Seungwoo Lee | Gary Geunbae Lee
The Companion Volume to the Proceedings of 41st Annual Meeting of the Association for Computational Linguistics

2002

Multilingual Question Answering with High Portability on Relational Databases
Hanmin Jung | Gary Geunbae Lee
COLING-02: Multilingual Summarization and Question Answering

Syllable-Pattern-Based Unknown-Morpheme Segmentation and Estimation for Hybrid Part-of-Speech Tagging of Korean
Gary Geunbae Lee | Jeongwon Cha | Jong-Hyeok Lee
Computational Linguistics, Volume 28, Number 1, March 2002

2001

Automatic Corpus-based Tone Prediction using K-ToBI Representation
Jin-Seok Lee | Byeongchang Kim | Gary Geunbae Lee
Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing

MAYA: A Fast Question-answering System Based on a Predictive Answer Indexer
Harksoo Kim | Kyungsun Kim | Gary Geunbae Lee | Jungyun Seo
Proceedings of the ACL 2001 Workshop on Open-Domain Question Answering

2000

Structural disambiguation of morpho-syntactic categorial parsing for Korean
Jeongwon Cha | Geunbae Lee
COLING 2000 Volume 2: The 18th International Conference on Computational Linguistics

Decision-Tree based Error Correction for Statistical Phrase Break Prediction in Korean
Byeongchang Kim | Geunbae Lee
COLING 2000 Volume 2: The 18th International Conference on Computational Linguistics

POSCAT: A Morpheme-based Speech Corpus Annotation Tool
Byeongchang Kim | Jin-seok Lee | Jeongwon Cha | Geunbae Lee
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)

Corpus-Based Learning of Compound Noun Indexing
Byung-Kwan Kwak | Jee-Hyub Kim | Geunbae Lee | Jung Yun Seo
ACL-2000 Workshop on Recent Advances in Natural Language Processing and Information Retrieval

1998

Unlimited Vocabulary Grapheme to Phoneme Conversion for Korean TTS
Byeongchang Kim | WonIl Lee | Geunbae Lee | Jong-Hyeok Lee
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1

Identifying Syntactic Role of Antecedent in Korean Relative Clause using Corpus and Thesaurus Informationes
Hui-Feng Li | Jong-Hyeok Lee | Geunbae Lee
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2

Generalized unknown morpheme guessing for hybrid POS tagging of Korean
Jeongwon Cha | Geunbae Lee | Jong-Hyeok Lee
Sixth Workshop on Very Large Corpora

Unlimited Vocabulary Grapheme to Phoneme Conversion for Korean TTS
Byeongchang Kim | WonIl Lee | Geunbae Lee | Jong-Hyeok Lee
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics

Identifying Syntactic Role of Antecedent in Korean Relative Clause Using Corpus and Thesaurus Information
Hui-Feng Li | Jong-Hyeok Lee | Geunbae Lee
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

1994

Table-driven Neural Syntactic Analysis of Spoken Korean
Wonll Lee | Geunbae Lee | Jong-Hyeok Lee
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics