Chungmin Lee

Also published as: Chong Min Lee, Chung-min Lee

2024

pdf abs
What GPT-4 Knows about Aspectual Coercion: Focused on “Begin the Book”
Seohyun Im | Chungmin Lee
Proceedings of the Workshop on Cognitive Aspects of the Lexicon @ LREC-COLING 2024

This paper explores whether Pre-trained Large Language Models (PLLMs) like GPT-4 can grasp profound linguistic insights into language phenomena such as Aspectual Coercion through interaction with Microsoft’s Copilot, which integrates GPT-4. Firstly, we examined Copilot’s understanding of the co-occurrence constraints of the aspectual verb “begin” and the complex-type noun “book” using the classic illustration of Aspectual Coercion, “begin the book.” Secondly, we verified Copilot’s awareness of both the default interpretation of “begin the book” with no specific context and the contextually preferred interpretation. Ultimately, Copilot provided appropriate responses regarding potential interpretations of “begin the book” based on its distributional properties and context-dependent preferred interpretations. However, it did not furnish sophisticated explanations concerning these interpretations from a linguistic theoretical perspective. On the other hand, by offering diverse interpretations grounded in distributional properties, language models like GPT-4 demonstrated their potential contribution to the refinement of linguistic theories. Furthermore, we suggested the feasibility of employing Language Models to construct language resources associated with language phenomena including Aspectual Coercion.

2019

pdf abs
Content Modeling for Automated Oral Proficiency Scoring System
Su-Youn Yoon | Chong Min Lee
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

We developed an automated oral proficiency scoring system for non-native English speakers’ spontaneous speech. Automated systems that score holistic proficiency are expected to assess a wide range of performance categories, and the content is one of the core performance categories. In order to assess the quality of the content, we trained a Siamese convolutional neural network (CNN) to model the semantic relationship between key points generated by experts and a test response. The correlation between human scores and Siamese CNN scores was comparable to human-human agreement (r=0.63), and it was higher than the baseline content features. The inclusion of Siamese CNN-based feature to the existing state-of-the-art automated scoring model achieved a small but statistically significant improvement. However, the new model suffered from score inflation for long atypical responses with serious content issues. We investigated the reasons of this score inflation by analyzing the associations with linguistic features and identifying areas strongly associated with the score errors.

2018

In this study, we develop content features for an automated scoring system of non-native English speakers’ spontaneous speech. The features calculate the lexical similarity between the question text and the ASR word hypothesis of the spoken response, based on traditional word vector models or word embeddings. The proposed features do not require any sample training responses for each question, and this is a strong advantage since collecting question-specific data is an expensive task, and sometimes even impossible due to concerns about question exposure. We explore the impact of these new features on the automated scoring of two different question types: (a) providing opinions on familiar topics and (b) answering a question about a stimulus material. The proposed features showed statistically significant correlations with the oral proficiency scores, and the combination of new features with the speech-driven features achieved a small but significant further improvement for the latter question type. Further analyses suggested that the new features were effective in assigning more accurate scores for responses with serious content issues.

2017

pdf abs
Predicting Audience’s Laughter During Presentations Using Convolutional Neural Network
Lei Chen | Chong Min Lee
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications

Public speakings play important roles in schools and work places and properly using humor contributes to effective presentations. For the purpose of automatically evaluating speakers’ humor usage, we build a presentation corpus containing humorous utterances based on TED talks. Compared to previous data resources supporting humor recognition research, ours has several advantages, including (a) both positive and negative instances coming from a homogeneous data set, (b) containing a large number of speakers, and (c) being open. Focusing on using lexical cues for humor recognition, we systematically compare a newly emerging text classification method based on Convolutional Neural Networks (CNNs) with a well-established conventional method using linguistic knowledge. The advantages of the CNN method are both getting higher detection accuracies and being able to learn essential features automatically.

pdf abs
Investigating neural architectures for short answer scoring
Brian Riordan | Andrea Horbach | Aoife Cahill | Torsten Zesch | Chong Min Lee
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications

Neural approaches to automated essay scoring have recently shown state-of-the-art performance. The automated essay scoring task typically involves a broad notion of writing quality that encompasses content, grammar, organization, and conventions. This differs from the short answer content scoring task, which focuses on content accuracy. The inputs to neural essay scoring models – ngrams and embeddings – are arguably well-suited to evaluate content in short answer scoring tasks. We investigate how several basic neural approaches similar to those used for automated essay scoring perform on short answer scoring. We show that neural architectures can outperform a strong non-neural baseline, but performance and optimal parameter settings vary across the more diverse types of prompts typical of short answer scoring.

2016

pdf
The Cloud of Knowing: Non-factive al-ta ‘know’ (as a Neg-raiser) in Korean
Chungmin Lee | Seungjin Hong
Proceedings of the 30th Pacific Asia Conference on Language, Information and Computation: Posters

pdf abs
Can We Make Computers Laugh at Talks?
Chong Min Lee | Su-Youn Yoon | Lei Chen
Proceedings of the Workshop on Computational Modeling of People’s Opinions, Personality, and Emotions in Social Media (PEOPLES)

Considering the importance of public speech skills, a system which makes a prediction on where audiences laugh in a talk can be helpful to a person who prepares for a talk. We investigated a possibility that a state-of-the-art humor recognition system can be used in detecting sentences inducing laughters in talks. In this study, we used TED talks and laughters in the talks as data. Our results showed that the state-of-the-art system needs to be improved in order to be used in a practical application. In addition, our analysis showed that classifying humorous sentences in talks is very challenging due to close distance between humorous and non-humorous sentences.

2015

pdf
Automated Scoring of Picture-based Story Narration
Swapna Somasundaran | Chong Min Lee | Martin Chodorow | Xinhao Wang
Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications

2014

We develop a method for detecting errors in semantic predicate-argument annotation, based on the variation n-gram error detection method. After establishing an appropriate data representation, we detect inconsistencies by searching for identical text with varying annotation. By remaining data-driven, we are able to detect inconsistencies arising from errors at lower layers of annotation.