Akhila Yerukola


2022

pdf
Explainable Slot Type Attentions to Improve Joint Intent Detection and Slot Filling
Kalpa Gunaratna | Vijay Srinivasan | Akhila Yerukola | Hongxia Jin
Findings of the Association for Computational Linguistics: EMNLP 2022

Joint intent detection and slot filling is a key research topic in natural language understanding (NLU). Existing joint intent and slot filling systems analyze and compute features collectively for all slot types, and importantly, have no way to explain the slot filling model decisions. In this work, we propose a novel approach that: (i) learns to generate additional slot type specific features in order to improve accuracy and (ii) provides explanations for slot filling decisions for the first time in a joint NLU model. We perform an additional constrained supervision using a set of binary classifiers for the slot type specific feature learning, thus ensuring appropriate attention weights are learned in the process to explain slot filling decisions for utterances. Our model is inherently explainable and does not need any post-hoc processing. We evaluate our approach on two widely used datasets and show accuracy improvements. Moreover, a detailed analysis is also provided for the exclusive slot explainability.

2021

pdf
The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Sebastian Gehrmann | Tosin Adewumi | Karmanya Aggarwal | Pawan Sasanka Ammanamanchi | Anuoluwapo Aremu | Antoine Bosselut | Khyathi Raghavi Chandu | Miruna-Adriana Clinciu | Dipanjan Das | Kaustubh Dhole | Wanyu Du | Esin Durmus | Ondřej Dušek | Chris Chinenye Emezue | Varun Gangal | Cristina Garbacea | Tatsunori Hashimoto | Yufang Hou | Yacine Jernite | Harsh Jhamtani | Yangfeng Ji | Shailza Jolly | Mihir Kale | Dhruv Kumar | Faisal Ladhak | Aman Madaan | Mounica Maddela | Khyati Mahajan | Saad Mahamood | Bodhisattwa Prasad Majumder | Pedro Henrique Martins | Angelina McMillan-Major | Simon Mille | Emiel van Miltenburg | Moin Nadeem | Shashi Narayan | Vitaly Nikolaev | Andre Niyongabo Rubungo | Salomey Osei | Ankur Parikh | Laura Perez-Beltrachini | Niranjan Ramesh Rao | Vikas Raunak | Juan Diego Rodriguez | Sashank Santhanam | João Sedoc | Thibault Sellam | Samira Shaikh | Anastasia Shimorina | Marco Antonio Sobrevilla Cabezudo | Hendrik Strobelt | Nishant Subramani | Wei Xu | Diyi Yang | Akhila Yerukola | Jiawei Zhou
Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021)

We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics. Measuring progress in NLG relies on a constantly evolving ecosystem of automated metrics, datasets, and human evaluation standards. Due to this moving target, new models often still evaluate on divergent anglo-centric corpora with well-established, but flawed, metrics. This disconnect makes it challenging to identify the limitations of current models and opportunities for progress. Addressing this limitation, GEM provides an environment in which models can easily be applied to a wide set of tasks and in which evaluation strategies can be tested. Regular updates to the benchmark will help NLG research become more multilingual and evolve the challenge alongside models. This paper serves as the description of the data for the 2021 shared task at the associated GEM Workshop.

pdf
Data Augmentation for Voice-Assistant NLU using BERT-based Interchangeable Rephrase
Akhila Yerukola | Mason Bretan | Hongxia Jin
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

We introduce a data augmentation technique based on byte pair encoding and a BERT-like self-attention model to boost performance on spoken language understanding tasks. We compare and evaluate this method with a range of augmentation techniques encompassing generative models such as VAEs and performance-boosting techniques such as synonym replacement and back-translation. We show our method performs strongly on domain and intent classification tasks for a voice assistant and in a user-study focused on utterance naturalness and semantic similarity.

2019

pdf
Do Massively Pretrained Language Models Make Better Storytellers?
Abigail See | Aneesh Pappu | Rohun Saxena | Akhila Yerukola | Christopher D. Manning
Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)

Large neural language models trained on massive amounts of text have emerged as a formidable strategy for Natural Language Understanding tasks. However, the strength of these models as Natural Language Generators is less clear. Though anecdotal evidence suggests that these models generate better quality text, there has been no detailed study characterizing their generation abilities. In this work, we compare the performance of an extensively pretrained model, OpenAI GPT2-117 (Radford et al., 2019), to a state-of-the-art neural story generation model (Fan et al., 2018). By evaluating the generated text across a wide variety of automatic metrics, we characterize the ways in which pretrained models do, and do not, make better storytellers. We find that although GPT2-117 conditions more strongly on context, is more sensitive to ordering of events, and uses more unusual words, it is just as likely to produce repetitive and under-diverse text when using likelihood-maximizing decoding algorithms.