V.G.Vinod Vydiswaran


2023

LHS712EE at BioLaySumm 2023: Using BART and LED to summarize biomedical research articles
Quancheng Liu | Xiheng Ren | V.G.Vinod Vydiswaran
The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks

As part of our participation in BioLaySumm 2023, we explored the use of large language models (LLMs) to automatically generate concise and readable summaries of biomedical research articles. We used pre-trained LLMs to fine-tune our summarization models on the two provided datasets and adapted them to the shared task within the constraints of training time and computational power. Our final models achieved very high relevance and factuality scores on the test set and ranked among the top five models in overall performance.

Defending against Insertion-based Textual Backdoor Attacks via Attribution
Jiazhao Li | Zhuofeng Wu | Wei Ping | Chaowei Xiao | V.G.Vinod Vydiswaran
Findings of the Association for Computational Linguistics: ACL 2023

Textual backdoor attacks, a novel attack model, have been shown to be effective in adding a backdoor to a model during training. Defending against such backdoor attacks has become urgent and important. In this paper, we propose AttDef, an efficient attribution-based pipeline to defend against two insertion-based poisoning attacks, BadNL and InSent. Specifically, we regard the tokens with larger attribution scores as potential triggers, since words with larger attribution scores contribute more to the false prediction results and are therefore more likely to be poison triggers. Additionally, we use an external pre-trained language model to distinguish whether an input is poisoned or not. We show that our proposed method generalizes sufficiently well in two common attack scenarios (poisoning training data and testing data), consistently improving on previous methods. For instance, AttDef can successfully mitigate both attacks with an average accuracy of 79.97% (56.59% up) and 48.34% (3.99% up) under pre-training and post-training attack defense, respectively, achieving new state-of-the-art performance on prediction recovery over four benchmark datasets.

2022

IDPG: An Instance-Dependent Prompt Generation Method
Zhuofeng Wu | Sinong Wang | Jiatao Gu | Rui Hou | Yuxiao Dong | V.G.Vinod Vydiswaran | Hao Ma
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Prompt tuning is a new, efficient NLP transfer learning paradigm that adds a task-specific prompt to each input instance during the model training stage. It freezes the pre-trained language model and only optimizes a few task-specific prompts. In this paper, we propose a conditional prompt generation method to generate prompts for each input instance, referred to as Instance-Dependent Prompt Generation (IDPG). Unlike traditional prompt tuning methods that use a fixed prompt, IDPG introduces a lightweight and trainable component to generate prompts based on each input sentence. Extensive experiments on ten natural language understanding (NLU) tasks show that the proposed strategy consistently outperforms various prompt tuning baselines and is on par with other efficient transfer learning methods such as Compacter while tuning far fewer model parameters.

2020

Identifying Medication Abuse and Adverse Effects from Tweets: University of Michigan at #SMM4H 2020
V.G.Vinod Vydiswaran | Deahan Yu | Xinyan Zhao | Ermioni Carr | Jonathan Martindale | Jingcheng Xiao | Noha Ghannam | Matteo Althoen | Alexis Castellanos | Neel Patel | Daniel Vasquez
Proceedings of the Fifth Social Media Mining for Health Applications Workshop & Shared Task

The team from the University of Michigan participated in three tasks in the Social Media Mining for Health Applications (#SMM4H) 2020 shared tasks – on detecting mentions of adverse effects (Task 2), extracting and normalizing them (Task 3), and detecting mentions of medication abuse (Task 4). Our approaches relied on a combination of traditional machine learning and deep learning models. On Tasks 2 and 4, our submitted runs performed at or above the task average.

PharmMT: A Neural Machine Translation Approach to Simplify Prescription Directions
Jiazhao Li | Corey Lester | Xinyan Zhao | Yuting Ding | Yun Jiang | V.G.Vinod Vydiswaran
Findings of the Association for Computational Linguistics: EMNLP 2020

The language used by physicians and health professionals in prescription directions includes medical jargon and implicit directives, and causes much confusion among patients. Human intervention to simplify the language at pharmacies may introduce additional errors that can lead to potentially severe health outcomes. We propose a novel machine translation-based approach, PharmMT, to automatically and reliably simplify prescription directions into patient-friendly language, thereby significantly reducing pharmacist workload. We evaluate the proposed approach on a dataset of over 530K prescriptions obtained from a large mail-order pharmacy. The end-to-end system achieves a BLEU score of 60.27 against the reference directions generated by pharmacists, a 39.6% relative improvement over the rule-based normalization. Pharmacists judged 94.3% of the simplified directions as usable as-is or with minimal changes. This work demonstrates the feasibility of a machine translation-based tool for simplifying prescription directions in real life.

2019

Identifying Adverse Drug Events Mentions in Tweets Using Attentive, Collocated, and Aggregated Medical Representation
Xinyan Zhao | Deahan Yu | V.G.Vinod Vydiswaran
Proceedings of the Fourth Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task

Identifying mentions of medical concepts in social media is challenging because of high variability in free text. In this paper, we propose a novel neural network architecture, the Collocated LSTM with Attentive Pooling and Aggregated representation (CLAPA), that integrates a bidirectional LSTM model with attention and pooling strategy and utilizes the collocation information from training data to improve the representation of medical concepts. The collocation and aggregation layers improve the model performance on the task of identifying mentions of adverse drug events (ADE) in tweets. Using the dataset made available as part of the workshop shared task, we show that careful selection of neighborhood contexts can help uncover useful local information and improve the overall medical concept representation.

Towards Text Processing Pipelines to Identify Adverse Drug Events-related Tweets: University of Michigan @ SMM4H 2019 Task 1
V.G.Vinod Vydiswaran | Grace Ganzel | Bryan Romas | Deahan Yu | Amy Austin | Neha Bhomia | Socheatha Chan | Stephanie Hall | Van Le | Aaron Miller | Olawunmi Oduyebo | Aulia Song | Radhika Sondhi | Danny Teng | Hao Tseng | Kim Vuong | Stephanie Zimmerman
Proceedings of the Fourth Social Media Mining for Health Applications (#SMM4H) Workshop & Shared Task

We participated in Task 1 of the Social Media Mining for Health Applications (SMM4H) 2019 Shared Tasks on detecting mentions of adverse drug events (ADEs) in tweets. Our approach relied on a text processing pipeline for tweets and on training traditional machine learning and deep learning models. Our submitted runs performed above average for the task.

2017

Identifying Usage Expression Sentences in Consumer Product Reviews
Shibamouli Lahiri | V.G.Vinod Vydiswaran | Rada Mihalcea
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In this paper we introduce the problem of identifying usage expression sentences in a consumer product review. We create a human-annotated gold standard dataset of 565 reviews spanning five distinct product categories. Our dataset consists of more than 3,000 annotated sentences. We further introduce a classification system to label sentences according to whether or not they describe some “usage”. The system combines lexical, syntactic, and semantic features in a product-agnostic fashion to yield good classification performance. We show the effectiveness of our approach using importance ranking of features, error analysis, and cross-product classification experiments.

2016

Proceedings of TextGraphs-10: the Workshop on Graph-based Methods for Natural Language Processing
Tanmoy Chakraborty | Martin Riedl | V.G.Vinod Vydiswaran
Proceedings of TextGraphs-10: the Workshop on Graph-based Methods for Natural Language Processing

2014

Proceedings of TextGraphs-9: the workshop on Graph-based Methods for Natural Language Processing
V.G.Vinod Vydiswaran | Amarnag Subramanya | Gabor Melli | Irina Matveeva
Proceedings of TextGraphs-9: the workshop on Graph-based Methods for Natural Language Processing

2010

Textual Entailment
Mark Sammons | Idan Szpektor | V.G.Vinod Vydiswaran
NAACL HLT 2010 Tutorial Abstracts

“Ask Not What Textual Entailment Can Do for You...”
Mark Sammons | V.G.Vinod Vydiswaran | Dan Roth
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

2009

A Framework for Entailed Relation Recognition
Dan Roth | Mark Sammons | V.G.Vinod Vydiswaran
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers