Keelan Evanini


2019

pdf
Using Rhetorical Structure Theory to Assess Discourse Coherence for Non-native Spontaneous Speech
Xinhao Wang | Binod Gyawali | James V. Bruno | Hillary R. Molloy | Keelan Evanini | Klaus Zechner
Proceedings of the Workshop on Discourse Relation Parsing and Treebanking 2019

This study aims to model the discourse structure of spontaneous spoken responses within the context of an assessment of English speaking proficiency for non-native speakers. Rhetorical Structure Theory (RST) has been commonly used in the analysis of discourse organization of written texts; however, limited research has been conducted to date on RST annotation and parsing of spoken language, in particular, non-native spontaneous speech. Because the measurement of discourse coherence is typically a key metric in human scoring rubrics for assessments of spoken language, we obtained RST annotations on non-native spoken responses from a standardized assessment of academic English proficiency. Subsequently, automatic parsers were trained on these annotations to process non-native spontaneous speech. Finally, a set of features was extracted from automatically generated RST trees to evaluate the discourse structure of non-native spontaneous speech; these features were then employed to further improve the validity of an automated speech scoring system.
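
As a rough illustration of this last step, the sketch below derives a few simple coherence features (tree depth, number of elementary discourse units, and relation-label counts) from an RST tree; the Node structure, relation labels, and feature set are illustrative assumptions, not the feature set used in the study.

from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    # relation is None for a leaf EDU, otherwise an RST relation label
    relation: Optional[str] = None
    children: List["Node"] = field(default_factory=list)

def tree_depth(node: Node) -> int:
    """Depth of the RST tree; deeper trees suggest more hierarchical discourse."""
    return 1 if not node.children else 1 + max(tree_depth(c) for c in node.children)

def count_edus(node: Node) -> int:
    """Number of elementary discourse units (leaves)."""
    return 1 if not node.children else sum(count_edus(c) for c in node.children)

def relation_counts(node: Node) -> Counter:
    """Histogram of relation labels used in the tree."""
    counts = Counter([node.relation] if node.relation else [])
    for c in node.children:
        counts += relation_counts(c)
    return counts

def rst_features(root: Node) -> dict:
    n_edus = count_edus(root)
    rels = relation_counts(root)
    return {
        "depth": tree_depth(root),
        "n_edus": n_edus,
        "elaboration_per_edu": rels.get("Elaboration", 0) / max(n_edus, 1),
        "n_distinct_relations": len(rels),
    }

# Toy tree: two EDUs joined by an Elaboration relation.
print(rst_features(Node(relation="Elaboration", children=[Node(), Node()])))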

pdf
Application of an Automatic Plagiarism Detection System in a Large-scale Assessment of English Speaking Proficiency
Xinhao Wang | Keelan Evanini | Matthew Mulholland | Yao Qian | James V. Bruno
Proceedings of the Fourteenth Workshop on Innovative Use of NLP for Building Educational Applications

This study aims to build an automatic system for the detection of plagiarized spoken responses in the context of an assessment of English speaking proficiency for non-native speakers. Classification models were trained to distinguish between plagiarized and non-plagiarized responses with two different types of features: text-to-text content similarity measures, which are commonly used in the task of plagiarism detection for written documents, and speaking proficiency measures, which were specifically designed for spontaneous speech and extracted using an automated speech scoring system. The experiments were first conducted on a large data set drawn from an operational English proficiency assessment across multiple years, and the best classifier on this heavily imbalanced data set resulted in an F1-score of 0.761 on the plagiarized class. This system was then validated on operational responses collected from a single administration of the assessment and achieved a recall of 0.897. The results indicate that the proposed system can potentially be used to improve the validity of both human and automated assessment of non-native spoken English.
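
For illustration, the sketch below shows one typical text-to-text content similarity feature (maximum TF-IDF cosine similarity against a pool of known source texts) fed into a simple classifier with class weighting for the imbalance; the data, the choice of similarity measure, and the classifier are stand-ins, not the paper's actual feature set or model.

import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical data: source texts a test taker might memorize, ASR transcriptions, labels.
source_texts = [
    "technology has changed the way people communicate with each other",
    "students should focus on practical skills during their studies",
]
responses = [
    "technology has changed the way people communicate with each other every day",
    "i think summer vacation should be longer because students need more rest",
]
labels = [1, 0]  # 1 = plagiarized, 0 = non-plagiarized

vectorizer = TfidfVectorizer(ngram_range=(1, 2)).fit(source_texts + responses)
source_matrix = vectorizer.transform(source_texts)

def max_source_similarity(response: str) -> float:
    """Highest cosine similarity between a response and any known source text."""
    return float(cosine_similarity(vectorizer.transform([response]), source_matrix).max())

X = np.array([[max_source_similarity(r)] for r in responses])
clf = LogisticRegression(class_weight="balanced").fit(X, labels)  # weighting for class imbalance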

2017

pdf
A Report on the 2017 Native Language Identification Shared Task
Shervin Malmasi | Keelan Evanini | Aoife Cahill | Joel Tetreault | Robert Pugh | Christopher Hamill | Diane Napolitano | Yao Qian
Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications

Native Language Identification (NLI) is the task of automatically identifying the native language (L1) of an individual based on their language production in a learned language. It is typically framed as a classification task where the set of L1s is known a priori. Two previous shared tasks on NLI have been organized where the aim was to identify the L1 of learners of English based on essays (2013) and spoken responses (2016) they provided during a standardized assessment of academic English proficiency. The 2017 shared task combines the inputs from the two prior tasks for the first time. There are three tracks: NLI on the essay only, NLI on the spoken response only (based on a transcription of the response and i-vector acoustic features), and NLI using both responses. We believe this makes for a more interesting shared task while building on the methods and results from the previous two shared tasks. In this paper, we report the results of the shared task. A total of 19 teams competed across the three different sub-tasks. The fusion track showed that combining the written and spoken responses provides a large boost in prediction accuracy. Multiple classifier systems (e.g. ensembles and meta-classifiers) were the most effective in all tasks, with most based on traditional classifiers (e.g. SVMs) with lexical/syntactic features.
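
A minimal sketch of the kind of traditional baseline described above (an SVM over lexical n-gram features) is shown below; the essays, L1 labels, and settings are toy placeholders rather than the shared-task data or the participants' systems.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy learner essays and hypothetical L1 codes.
essays = [
    "I am agree with the statement because it is more better for students",
    "In my country we have since many years a different school system",
    "I am thinking that this topic is very interest for everyone",
    "We must become more information before we can decide this question",
]
l1_labels = ["SPA", "GER", "SPA", "GER"]

# Word unigrams/bigrams as lexical features; character n-grams could be added
# in the same way, e.g. for the transcription-based spoken track.
nli_baseline = make_pipeline(
    TfidfVectorizer(analyzer="word", ngram_range=(1, 2)),
    LinearSVC(),
)
nli_baseline.fit(essays, l1_labels)
print(nli_baseline.predict(["I am agree that the school system must become better"]))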

pdf
Discourse Annotation of Non-native Spontaneous Spoken Responses Using the Rhetorical Structure Theory Framework
Xinhao Wang | James Bruno | Hillary Molloy | Keelan Evanini | Klaus Zechner
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

The availability of the Rhetorical Structure Theory (RST) Discourse Treebank has spurred substantial research into discourse analysis of written texts; however, limited research has been conducted to date on RST annotation and parsing of spoken language, in particular, non-native spontaneous speech. Considering that the measurement of discourse coherence is typically a key metric in human scoring rubrics for assessments of spoken language, we initiated a research effort to obtain RST annotations of a large number of non-native spoken responses from a standardized assessment of academic English proficiency. The resulting inter-annotator kappa agreements on the three different levels of Span, Nuclearity, and Relation are 0.848, 0.766, and 0.653, respectively. Furthermore, a set of features was explored to evaluate the discourse structure of non-native spontaneous speech based on these annotations; the highest performing feature resulted in a correlation of 0.612 with scores of discourse coherence provided by expert human raters.
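
The agreement and correlation figures above correspond to standard metrics; the toy example below computes Cohen's kappa between two annotators' relation labels and the Pearson correlation between a feature and human coherence scores. Actual RST agreement is computed over aligned spans, which this simplified sketch does not model.

from sklearn.metrics import cohen_kappa_score
from scipy.stats import pearsonr

# Hypothetical relation labels assigned by two annotators to the same units.
rater_a = ["Elaboration", "Background", "Contrast", "Elaboration", "Joint"]
rater_b = ["Elaboration", "Background", "Elaboration", "Elaboration", "Joint"]
print("kappa:", cohen_kappa_score(rater_a, rater_b))

# Hypothetical feature values and human coherence scores for five responses.
feature_values = [0.42, 0.55, 0.31, 0.48, 0.60]
human_coherence = [3, 4, 2, 3, 4]
r, _ = pearsonr(feature_values, human_coherence)
print("Pearson r:", round(r, 3))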

2015

pdf
Automated Speech Recognition Technology for Dialogue Interaction with Non-Native Interlocutors
Alexei V. Ivanov | Vikram Ramanarayanan | David Suendermann-Oeft | Melissa Lopez | Keelan Evanini | Jidong Tao
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue

pdf
A distributed cloud-based dialog system for conversational application development
Vikram Ramanarayanan | David Suendermann-Oeft | Alexei V. Ivanov | Keelan Evanini
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue

2014

pdf
Automatic detection of plagiarized spoken responses
Keelan Evanini | Xinhao Wang
Proceedings of the Ninth Workshop on Innovative Use of NLP for Building Educational Applications

pdf
Automated scoring of speaking items in an assessment for teachers of English as a Foreign Language
Klaus Zechner | Keelan Evanini | Su-Youn Yoon | Lawrence Davis | Xinhao Wang | Lei Chen | Chong Min Lee | Chee Wee Leong
Proceedings of the Ninth Workshop on Innovative Use of NLP for Building Educational Applications

2013

pdf
Coherence Modeling for the Automated Assessment of Spontaneous Spoken Responses
Xinhao Wang | Keelan Evanini | Klaus Zechner
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf
Prompt-based Content Scoring for Automated Spoken Language Assessment
Keelan Evanini | Shasha Xie | Klaus Zechner
Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications

2012

pdf
Exploring Content Features for Automated Speech Scoring
Shasha Xie | Keelan Evanini | Klaus Zechner
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2011

pdf
Non-scorable Response Detection for Automated Speaking Proficiency Assessment
Su-Youn Yoon | Keelan Evanini | Klaus Zechner
Proceedings of the Sixth Workshop on Innovative Use of NLP for Building Educational Applications

2010

pdf
Using Amazon Mechanical Turk for Transcription of Non-Native Speech
Keelan Evanini | Derrick Higgins | Klaus Zechner
Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk