Jeongwoo Ko

2020

Understanding emotion expressed in language has a wide range of applications, from building empathetic chatbots to detecting harmful online behavior. Advancement in this area can be improved using large-scale datasets with a fine-grained typology, adaptable to multiple downstream tasks. We introduce GoEmotions, the largest manually annotated dataset of 58k English Reddit comments, labeled for 27 emotion categories or Neutral. We demonstrate the high quality of the annotations via Principal Preserved Component Analysis. We conduct transfer learning experiments with existing emotion benchmarks to show that our dataset generalizes well to other domains and different emotion taxonomies. Our BERT-based model achieves an average F1-score of .46 across our proposed taxonomy, leaving much room for improvement.

2007

Language-independent Probabilistic Answer Ranking for Question Answering
Jeongwoo Ko | Teruko Mitamura | Eric Nyberg
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

A Probabilistic Framework for Answer Selection in Question Answering
Jeongwoo Ko | Luo Si | Eric Nyberg
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Proceedings of the Main Conference

2006

Exploiting Multiple Semantic Resources for Answer Selection
Jeongwoo Ko | Laurie Hiyakumoto | Eric Nyberg
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper describes the utility of semantic resources such as the Web, WordNet and gazetteers in the answer selection process for a question-answering system. In contrast with previous work using individual semantic resources to support answer selection, our work combines multiple resources to boost the confidence scores assigned to correct answers and evaluates different combination strategies based on unweighted sums, weighted linear combinations, and logistic regression. We apply our approach to select answers from candidates produced by three different extraction techniques of varying quality, focusing on TREC questions whose answers represent locations or proper-names. Our experimental results demonstrate that the combination of semantic resources is more effective than individual resources for all three extraction techniques, improving answer selection accuracy by as much as 32.35% for location questions and 72% for proper-name questions. Of the combination strategies tested, logistic regression models produced the best results for both location and proper-name questions.

Analyzing the Effects of Spoken Dialog Systems on Driving Behavior
Jeongwoo Ko | Fumihiko Murase | Teruko Mitamura | Eric Nyberg | Masahiko Tateishi | Ichiro Akahori
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper presents an evaluation of a spoken dialog system for automotive environments. Our overall goal was to measure the impact of user-system interaction on the user's driving performance, and to determine whether adding context-awareness to the dialog system might reduce the degree of user distraction during driving. To address this issue, we incorporated context-awareness into a spoken dialog system, and implemented three system features using user context, network context and dialog context. A series of experiments were conducted under three different configurations: driving without a dialog system, driving while using a context-aware dialog system, and driving while using a context-unaware dialog system. We measured the differences between the three configurations by comparing the average car speed, the frequency of speed changes and the angle between the car's direction and the centerline on the road. These results indicate that context-awareness could reduce the degree of user distraction when using a dialog system during driving.

2004

An Information Repository Model for Advanced Question Answering Systems
Vasco Calais Pedro | Jeongwoo Ko | Eric Nyberg | Teruko Mitamura
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

2003

This paper presents an overview of the tools provided by KANTOO MT system for controlled source language checking, source text analysis, and terminology management. The steps in each process are described, and screen images are provided to illustrate the system architecture and example tool interfaces.

2002

The KANTOO MT system: controlled language checker and lexical maintenance tool
Teruko Mitamura | Eric Nyberg | Kathy Baker | Peter Cramer | Jeongwoo Ko | David Svoboda | Michael Duggan
Proceedings of the 5th Conference of the Association for Machine Translation in the Americas: System Descriptions

We will present the KANTOO machine translation environment, a set of software servers and tools for multilingual document production. KANTOO includes modules for source language analysis, target language generation, source terminology management, target terminology management, and knowledge source development (see Figure 1).