Vikram Ramanarayanan


On the Utility of Audiovisual Dialog Technologies and Signal Analytics for Real-time Remote Monitoring of Depression Biomarkers
Michael Neumann | Oliver Roessler | David Suendermann-Oeft | Vikram Ramanarayanan
Proceedings of the First Workshop on Natural Language Processing for Medical Conversations

We investigate the utility of audiovisual dialog systems combined with speech and video analytics for real-time remote monitoring of depression at scale in uncontrolled environments. We collected audiovisual conversational data from participants who interacted with a cloud-based multimodal dialog system, and automatically extracted a large set of speech and vision metrics based on the rich existing literature of laboratory studies. We report on the efficacy of various audio and video metrics in differentiating people with mild, moderate and severe depression, and discuss the implications of these results for the deployment of such technologies in real-world neurological diagnosis and monitoring applications.


Scoring Interactional Aspects of Human-Machine Dialog for Language Learning and Assessment using Text Features
Vikram Ramanarayanan | Matthew Mulholland | Yao Qian
Proceedings of the 20th Annual SIGdial Meeting on Discourse and Dialogue

While there has been much work in the language learning and assessment literature on human and automated scoring of essays and short constructed responses, there is little to no work examining text features for scoring of dialog data, particularly interactional aspects thereof, to assess conversational proficiency over and above constructed response skills. Our work bridges this gap by investigating both human and automated approaches towards scoring human–machine text dialog in the context of a real-world language learning application. We collected conversational data of human learners interacting with a cloud-based standards-compliant dialog system, triple-scored these data along multiple dimensions of conversational proficiency, and then analyzed the performance trends. We further examined two different approaches to automated scoring of such data and show that these approaches perform on par with or better than human agreement for a majority of dimensions of the scoring rubric.


Toward Automatically Measuring Learner Ability from Human-Machine Dialog Interactions using Novel Psychometric Models
Vikram Ramanarayanan | Michelle LaMar
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications

While dialog systems have been widely deployed for computer-assisted language learning (CALL) and formative assessment systems in recent years, relatively limited work has been done with respect to the psychometrics and validity of these technologies in evaluating and providing feedback regarding student learning and conversational ability. This paper formulates a Markov decision process-based measurement model and applies it to text chat data collected from crowdsourced native and non-native English language speakers interacting with an automated dialog agent. We investigate how well the model measures speaker conversational ability, and find that it effectively captures the differences in how native and non-native speakers of English accomplish the dialog task. Such models could have important implications for CALL systems of the future that effectively combine dialog management with measurement of learner conversational ability in real time.

Automatic Token and Turn Level Language Identification for Code-Switched Text Dialog: An Analysis Across Language Pairs and Corpora
Vikram Ramanarayanan | Robert Pugh
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue

We examine the efficacy of various feature–learner combinations for language identification in different types of text-based code-switched interactions – human-human dialog, human-machine dialog, and monolog – at both the token and turn levels. To examine the generalization of such methods across language pairs and datasets, we analyze 10 different datasets of code-switched text. We extract a variety of character- and word-based text features and pass them into multiple learners, including conditional random fields, logistic regressors and recurrent neural networks. We further examine the efficacy of novel character-level embedding and GloVe features in improving performance and observe that our best-performing text system significantly outperforms a majority vote baseline across language pairs and datasets.

Leveraging Multimodal Dialog Technology for the Design of Automated and Interactive Student Agents for Teacher Training
David Pautler | Vikram Ramanarayanan | Kirby Cofino | Patrick Lange | David Suendermann-Oeft
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue

We present a paradigm for interactive teacher training that leverages multimodal dialog technology to puppeteer custom-designed embodied conversational agents (ECAs) in student roles. We used the open-source multimodal dialog system HALEF to implement a small-group classroom math discussion involving Venn diagrams where a human teacher candidate has to interact with two student ECAs whose actions are controlled by the dialog system. Such an automated paradigm has the potential to be extended and scaled to a wide range of interactive simulation scenarios in education, medicine, and business where group interaction training is essential.


Automated Speech Recognition Technology for Dialogue Interaction with Non-Native Interlocutors
Alexei V. Ivanov | Vikram Ramanarayanan | David Suendermann-Oeft | Melissa Lopez | Keelan Evanini | Jidong Tao
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue

A Distributed Cloud-based Dialog System for Conversational Application Development
Vikram Ramanarayanan | David Suendermann-Oeft | Alexei V. Ivanov | Keelan Evanini
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue