Michael Flor


2024

Three Studies on Predicting Word Concreteness with Embedding Vectors
Michael Flor
Proceedings of the Workshop on Cognitive Aspects of the Lexicon @ LREC-COLING 2024

Human-assigned concreteness ratings for words are commonly used in psycholinguistic and computational linguistic studies. Previous research has shown that such ratings can be modeled and extrapolated by using dense word-embedding representations. However, due to rater disagreement, considerable amounts of human ratings in published datasets are not reliable. We investigate how such unreliable data influences modeling of concreteness with word embeddings. Study 1 compares fourteen embedding models over three datasets of concreteness ratings, showing that most models achieve high correlations with human ratings and exhibit low error rates on predictions. Study 2 investigates how exclusion of the less reliable ratings influences the modeling results. It indicates that improved results can be achieved when data is cleaned. Study 3 adds further conditions to those of Study 2 and indicates that the improved results hold only for the cleaned data, and that in the general case removing the less reliable data points is not useful.

2023

Using Neural Machine Translation for Generating Diverse Challenging Exercises for Language Learner
Frank Palma Gomez | Subhadarshi Panda | Michael Flor | Alla Rozovskaya
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

We propose a novel approach to automatically generate distractors for cloze exercises for English language learners, using round-trip neural machine translation. A carrier sentence is translated from English into another (pivot) language and back, and distractors are produced by aligning the original sentence with its round-trip translation. We make use of 16 linguistically-diverse pivots and generate hundreds of translation hypotheses in each direction. We show that using hundreds of translations allows us to generate a rich set of challenging distractors. Moreover, we find that typologically unrelated language pivots contribute more diverse candidate distractors, compared to language pivots that are closely related. We further evaluate the use of machine translation systems of varying quality and find that better quality MT systems produce more challenging distractors. Finally, we conduct a study with language learners, demonstrating that the automatically generated distractors are of the same difficulty as the gold distractors produced by human experts.

2022

Automatic Generation of Distractors for Fill-in-the-Blank Exercises with Round-Trip Neural Machine Translation
Subhadarshi Panda | Frank Palma Gomez | Michael Flor | Alla Rozovskaya
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

In a fill-in-the-blank exercise, a student is presented with a carrier sentence with one word hidden, and a multiple-choice list that includes the correct answer and several inappropriate options, called distractors. We propose to automatically generate distractors using round-trip neural machine translation: the carrier sentence is translated from English into another (pivot) language and back, and distractors are produced by aligning the original sentence and its round-trip translation. We show that using hundreds of translations for a given sentence allows us to generate a rich set of challenging distractors. Further, using multiple pivot languages produces a diverse set of candidates. The distractors are evaluated against a real corpus of cloze exercises and checked manually for validity. We demonstrate that the proposed method significantly outperforms two strong baselines.
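The alignment step described in this abstract can be illustrated with a minimal sketch. It assumes the round-trip translations have already been produced by some MT system (none is called here), and it substitutes a simple token-level diff from Python's standard `difflib` for whatever alignment method the paper actually uses. The function name `candidate_distractors` and the example sentences are hypothetical, and the manual validity check mentioned in the abstract is not modeled.

```python
from difflib import SequenceMatcher


def candidate_distractors(original, round_trips, target):
    """Collect words that round-trip translations substituted for `target`.

    Hypothetical helper: a token-level diff stands in for the alignment
    step; real systems may align the sentences quite differently.
    """
    orig_tokens = original.lower().split()
    target = target.lower()
    candidates = []
    for hypothesis in round_trips:
        hyp_tokens = hypothesis.lower().split()
        matcher = SequenceMatcher(a=orig_tokens, b=hyp_tokens, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            # Keep only one-for-one substitutions of the target word.
            if tag == "replace" and i2 - i1 == 1 and j2 - j1 == 1:
                if orig_tokens[i1] == target:
                    word = hyp_tokens[j1]
                    if word not in candidates:
                        candidates.append(word)
    return candidates


# Invented carrier sentence and invented round-trip outputs (assumptions).
original = "the committee approved the proposal yesterday"
round_trips = [
    "the committee accepted the proposal yesterday",
    "the committee approved the proposal yesterday",
    "the committee endorsed the proposal yesterday",
    "the committee accepted the proposal yesterday",
]
distractors = candidate_distractors(original, round_trips, "approved")
```

Candidates collected this way would still need filtering, since a substituted word may be an acceptable answer rather than a true distractor.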

2020

Emotion Arcs of Student Narratives
Swapna Somasundaran | Xianyang Chen | Michael Flor
Proceedings of the First Joint Workshop on Narrative Understanding, Storylines, and Events

This paper studies emotion arcs in student narratives. We construct emotion arcs based on event affect and implied sentiments, which correspond to plot elements in the story. We show that student narratives can show elements of plot structure in their emotion arcs and that properties of these arcs can be useful indicators of narrative quality. We build a system and perform analysis to show that our arc-based features are complementary to previously studied sentiment features in this area.

Go Figure! Multi-task transformer-based architecture for metaphor detection using idioms: ETS team in 2020 metaphor shared task
Xianyang Chen | Chee Wee (Ben) Leong | Michael Flor | Beata Beigman Klebanov
Proceedings of the Second Workshop on Figurative Language Processing

This paper describes the ETS entry to the 2020 Metaphor Detection shared task. Our contribution consists of a sequence of experiments using BERT, starting with a baseline, strengthening it by spell-correcting the TOEFL corpus, followed by a multi-task learning setting, where one of the tasks is the token-level metaphor classification as per the shared task, while the other is meant to provide additional training that we hypothesized to be relevant to the main task. In one case, out-of-domain data manually annotated for metaphor is used for the auxiliary task; in the other case, in-domain data automatically annotated for idioms is used for the auxiliary task. Both multi-task experiments yield promising results.

2019

My Turn To Read: An Interleaved E-book Reading Tool for Developing and Struggling Readers
Nitin Madnani | Beata Beigman Klebanov | Anastassia Loukina | Binod Gyawali | Patrick Lange | John Sabatini | Michael Flor
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

Literacy is crucial for functioning in modern society. It underpins everything from educational attainment and employment opportunities to health outcomes. We describe My Turn To Read, an app that uses interleaved reading to help developing and struggling readers improve reading skills while reading for meaning and pleasure. We hypothesize that the longer-term impact of the app will be to help users become better, more confident readers with an increased stamina for extended reading. We describe the technology and present preliminary evidence in support of this hypothesis.

Lexical concreteness in narrative
Michael Flor | Swapna Somasundaran
Proceedings of the Second Workshop on Storytelling

This study explores the relation between lexical concreteness and narrative text quality. We present a methodology to quantitatively measure lexical concreteness of a text. We apply it to a corpus of student stories, scored according to writing evaluation rubrics. Lexical concreteness is weakly-to-moderately related to story quality, depending on story-type. The relation is mostly borne by adjectives and nouns, but also found for adverbs and verbs.

A Benchmark Corpus of English Misspellings and a Minimally-supervised Model for Spelling Correction
Michael Flor | Michael Fried | Alla Rozovskaya
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

Spelling correction has attracted a lot of attention in the NLP community. However, models have been usually evaluated on artificially-created or proprietary corpora. A publicly-available corpus of authentic misspellings, annotated in context, is still lacking. To address this, we present and release an annotated data set of 6,121 spelling errors in context, based on a corpus of essays written by English language learners. We also develop a minimally-supervised context-aware approach to spelling correction. It achieves strong results on our data: 88.12% accuracy. This approach can also train with a minimal amount of annotated data (performance reduced by less than 1%). Furthermore, this approach allows easy portability to new domains. We evaluate our model on data from a medical domain and demonstrate that it rivals the performance of a model trained and tuned on in-domain data.

How to account for mispellings: Quantifying the benefit of character representations in neural content scoring models
Brian Riordan | Michael Flor | Robert Pugh
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

Character-based representations in neural models have been claimed to be a tool to overcome spelling variation in word token-based input. We examine this claim in neural models for content scoring. We formulate precise hypotheses about the possible effects of adding character representations to word-based models and test these hypotheses on large-scale real world content scoring datasets. We find that, while character representations may provide small performance gains in general, their effectiveness in accounting for spelling variation may be limited. We show that spelling correction can provide larger gains than character representations, and that spelling correction improves the performance of models with character representations. With these insights, we report a new state of the art on the ASAP-SAS content scoring dataset.

2018

A Corpus of Non-Native Written English Annotated for Metaphor
Beata Beigman Klebanov | Chee Wee (Ben) Leong | Michael Flor
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

We present a corpus of 240 argumentative essays written by non-native speakers of English annotated for metaphor. The corpus is made publicly available. We provide benchmark performance of state-of-the-art systems on this new corpus, and explore the relationship between writing proficiency and metaphor use.

Towards Evaluating Narrative Quality In Student Writing
Swapna Somasundaran | Michael Flor | Martin Chodorow | Hillary Molloy | Binod Gyawali | Laura McCulla
Transactions of the Association for Computational Linguistics, Volume 6

This work lays the foundation for automated assessments of narrative quality in student writing. We first manually score essays for narrative-relevant traits and sub-traits, and measure inter-annotator agreement. We then explore linguistic features that are indicative of good narrative writing and use them to build an automated scoring system. Experiments show that our features are more effective in scoring specific aspects of narrative quality than a state-of-the-art feature set.

A Semantic Role-based Approach to Open-Domain Automatic Question Generation
Michael Flor | Brian Riordan
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

We present a novel rule-based system for automatic generation of factual questions from sentences, using semantic role labeling (SRL) as the main form of text analysis. The system is capable of generating both wh-questions and yes/no questions from the same semantic analysis. We present an extensive evaluation of the system and compare it to a recent neural network architecture for question generation. The SRL-based system outperforms the neural system in both average quality and variety of generated questions.

Catching Idiomatic Expressions in EFL Essays
Michael Flor | Beata Beigman Klebanov
Proceedings of the Workshop on Figurative Language Processing

This paper presents an exploratory study on large-scale detection of idiomatic expressions in essays written by non-native speakers of English. We describe a computational search procedure for automatic detection of idiom-candidate phrases in essay texts. The study used a corpus of essays written during a standardized examination of English language proficiency. Automatically-flagged candidate expressions were manually annotated for idiomaticity. The study found that idioms are widely used in EFL essays. The study also showed that a search algorithm that accommodates the syntactic and lexical flexibility of idioms can increase the recall of idiom instances by 30%, but it also increases the amount of false positives.

2017

Sentiment Analysis and Lexical Cohesion for the Story Cloze Task
Michael Flor | Swapna Somasundaran
Proceedings of the 2nd Workshop on Linking Models of Lexical, Sentential and Discourse-level Semantics

We present two NLP components for the Story Cloze Task – dictionary-based sentiment analysis and lexical cohesion. While previous research found no contribution from sentiment analysis to the accuracy on this task, we demonstrate that sentiment is an important aspect. We describe a new approach, using a rule that estimates sentiment congruence in a story. Our sentiment-based system achieves strong results on this task. Our lexical cohesion system achieves accuracy comparable to previously published baseline results. A combination of the two systems achieves better accuracy than published baselines. We argue that sentiment analysis should be considered an integral part of narrative comprehension.

2016

Automated classification of collaborative problem solving interactions in simulated science tasks
Michael Flor | Su-Youn Yoon | Jiangang Hao | Lei Liu | Alina von Davier
Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications

Topicality-Based Indices for Essay Scoring
Beata Beigman Klebanov | Michael Flor | Binod Gyawali
Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications

Semantic classifications for detection of verb metaphors
Beata Beigman Klebanov | Chee Wee Leong | E. Dario Gutierrez | Ekaterina Shutova | Michael Flor
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2015

Supervised Word-Level Metaphor Detection: Experiments with Concreteness and Reweighting of Examples
Beata Beigman Klebanov | Chee Wee Leong | Michael Flor
Proceedings of the Third Workshop on Metaphor in NLP

2014

Different Texts, Same Metaphors: Unigrams and Beyond
Beata Beigman Klebanov | Ben Leong | Michael Heilman | Michael Flor
Proceedings of the Second Workshop on Metaphor in NLP

ETS Lexical Associations System for the COGALEX-4 Shared Task
Michael Flor | Beata Beigman Klebanov
Proceedings of the 4th Workshop on Cognitive Aspects of the Lexicon (CogALex)

2013

Word Association Profiles and their Use for Automated Scoring of Essays
Beata Beigman Klebanov | Michael Flor
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Argumentation-Relevant Metaphors in Test-Taker Essays
Beata Beigman Klebanov | Michael Flor
Proceedings of the First Workshop on Metaphor in NLP

Lexical Tightness and Text Complexity
Michael Flor | Beata Beigman Klebanov | Kathleen M. Sheehan
Proceedings of the Workshop on Natural Language Processing for Improving Textual Accessibility

A Two-Stage Approach for Generating Unbiased Estimates of Text Complexity
Kathleen M. Sheehan | Michael Flor | Diane Napolitano
Proceedings of the Workshop on Natural Language Processing for Improving Textual Accessibility

Associative Texture Is Lost In Translation
Beata Beigman Klebanov | Michael Flor
Proceedings of the Workshop on Discourse in Machine Translation

2012

Four types of context for automatic spelling correction
Michael Flor
Traitement Automatique des Langues, Volume 53, Numéro 3 : Du bruit dans le signal : gestion des erreurs en traitement automatique des langues [Managing noise in the signal: Error handling in natural language processing]

On using context for automatic correction of non-word misspellings in student essays
Michael Flor | Yoko Futagi
Proceedings of the Seventh Workshop on Building Educational Applications Using NLP