Subhashini Venugopalan


2023

Clinical BERTScore: An Improved Measure of Automatic Speech Recognition Performance in Clinical Settings
Joel Shor | Ruyue Agnes Bi | Subhashini Venugopalan | Steven Ibara | Roman Goldenberg | Ehud Rivlin
Proceedings of the 5th Clinical Natural Language Processing Workshop

Automatic Speech Recognition (ASR) in medical contexts has the potential to save time, cut costs, increase report accuracy, and reduce physician burnout. However, the healthcare industry has been slower to adopt this technology, in part due to the importance of avoiding medically-relevant transcription mistakes. In this work, we present the Clinical BERTScore (CBERTScore), an ASR metric that penalizes clinically-relevant mistakes more than others. We collect a benchmark of 18 clinician preferences on 149 realistic medical sentences called the Clinician Transcript Preference benchmark (CTP) and make it publicly available for the community to further develop clinically-aware ASR metrics. To our knowledge, this is the first public dataset of its kind. We demonstrate that our metric more closely aligns with clinician preferences on medical sentences as compared to other metrics (WER, BLEU, METEOR, etc.), sometimes by wide margins.
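The abstract describes a BERTScore-style metric that upweights clinically relevant tokens. The following is a minimal illustrative sketch of that idea, not the authors' CBERTScore implementation: the model name, the toy clinical lexicon, and the clinical_weight() heuristic are all assumptions introduced for illustration.

```python
# Sketch of a clinically weighted, BERTScore-style recall (illustrative only).
import torch
from transformers import AutoModel, AutoTokenizer

MODEL = "bert-base-uncased"  # assumption; the paper may rely on a clinical model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL)

CLINICAL_TERMS = {"metformin", "hypertension", "mg"}  # toy lexicon (assumption)

def clinical_weight(token: str) -> float:
    """Upweight tokens that look clinically relevant (illustrative heuristic)."""
    return 3.0 if token.strip("#").lower() in CLINICAL_TERMS else 1.0

def embed(sentence: str):
    """Return wordpiece tokens and their L2-normalized contextual embeddings."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]          # (seq_len, dim)
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    return tokens, torch.nn.functional.normalize(hidden, dim=-1)

def weighted_recall(reference: str, hypothesis: str) -> float:
    """Match each reference token to its best cosine match in the hypothesis,
    averaging with weights so clinical tokens dominate the score."""
    ref_toks, ref_emb = embed(reference)
    _, hyp_emb = embed(hypothesis)
    best = (ref_emb @ hyp_emb.T).max(dim=1).values          # best match per ref token
    weights = torch.tensor([clinical_weight(t) for t in ref_toks])
    return float((weights * best).sum() / weights.sum())

print(weighted_recall("patient takes metformin 500 mg daily",
                      "patient takes met form in 500 mg daily"))
```

Under this weighting, a transcription error on "metformin" lowers the score far more than an error on a non-clinical word, which is the behavior the metric is designed to reward.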

2022

Context-Aware Abbreviation Expansion Using Large Language Models
Shanqing Cai | Subhashini Venugopalan | Katrin Tomanek | Ajit Narayanan | Meredith Morris | Michael Brenner
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Motivated by the need for accelerating text entry in augmentative and alternative communication (AAC) for people with severe motor impairments, we propose a paradigm in which phrases are abbreviated aggressively as primarily word-initial letters. Our approach is to expand the abbreviations into full-phrase options by leveraging conversation context with the power of pretrained large language models (LLMs). Through zero-shot, few-shot, and fine-tuning experiments on four public conversation datasets, we show that for replies to the initial turn of a dialog, an LLM with 64B parameters is able to exactly expand over 70% of phrases with abbreviation length up to 10, leading to an effective keystroke saving rate of up to about 77% on these exact expansions. Including a small amount of context in the form of a single conversation turn more than doubles abbreviation expansion accuracies compared to having no context, an effect that is more pronounced for longer phrases. Additionally, the robustness of models against typo noise can be enhanced through fine-tuning on noisy data.
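The abstract outlines expanding word-initial-letter abbreviations into full phrases by prompting an LLM with conversation context. Below is a minimal few-shot prompting sketch of that setup under stated assumptions: the prompt format, the few-shot examples, and the small stand-in model are illustrative, not the paper's 64B-parameter configuration.

```python
# Sketch of context-aware abbreviation expansion via few-shot prompting (illustrative).
from transformers import pipeline

def word_initial_abbreviation(phrase: str) -> str:
    """Abbreviate a phrase as its word-initial letters, e.g. 'i need help' -> 'inh'."""
    return "".join(word[0] for word in phrase.lower().split())

# Toy few-shot examples (assumption): (prior conversation turn, abbreviation, expansion).
FEW_SHOT = [
    ("Do you want to grab lunch?", "iwltt", "i would like to thanks"),
    ("How was the appointment?", "iwf", "it went fine"),
]

def build_prompt(context_turn: str, abbreviation: str) -> str:
    """Assemble a prompt: few-shot demonstrations followed by the query to expand."""
    blocks = [
        f"Context: {ctx}\nAbbreviation: {abbr}\nExpansion: {expansion}"
        for ctx, abbr, expansion in FEW_SHOT
    ]
    blocks.append(f"Context: {context_turn}\nAbbreviation: {abbreviation}\nExpansion:")
    return "\n\n".join(blocks)

# Any causal LM can stand in here; a small model keeps the sketch runnable.
generator = pipeline("text-generation", model="gpt2")
prompt = build_prompt("Are you coming to the party tonight?", "icmi")
out = generator(prompt, max_new_tokens=10, num_return_sequences=1)
print(out[0]["generated_text"][len(prompt):].strip().splitlines()[0])
```

Including the prior conversation turn in the prompt is what lets the model disambiguate short abbreviations such as "icmi", mirroring the paper's finding that even a single turn of context substantially improves expansion accuracy.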

2016

Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text
Subhashini Venugopalan | Lisa Anne Hendricks | Raymond Mooney | Kate Saenko
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

2015

Translating Videos to Natural Language Using Deep Recurrent Neural Networks
Subhashini Venugopalan | Huijuan Xu | Jeff Donahue | Marcus Rohrbach | Raymond Mooney | Kate Saenko
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

Integrating Language and Vision to Generate Natural Language Descriptions of Videos in the Wild
Jesse Thomason | Subhashini Venugopalan | Sergio Guadarrama | Kate Saenko | Raymond Mooney
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers