Siddhartha Jonnalagadda


2022

Massive-scale Decoding for Text Generation using Lattices
Jiacheng Xu | Siddhartha Jonnalagadda | Greg Durrett
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Conditional neural text generation models generate high-quality outputs, but often concentrate around a mode when what we really want is a diverse set of options. We present a search algorithm to construct lattices encoding a massive number of generation options. First, we restructure decoding as a best-first search, which explores the space differently than beam search and improves efficiency by avoiding pruning paths. Second, we revisit the idea of hypothesis recombination: we can identify pairs of similar generation candidates during search and merge them as an approximation. On both summarization and machine translation, we show that our algorithm encodes thousands of diverse options that remain grammatical and high-quality into one lattice. This algorithm provides a foundation for building downstream generation applications on top of massive-scale diverse outputs.

Logical Reasoning for Task Oriented Dialogue Systems
Sajjad Beygi | Maryam Fazel-Zarandi | Alessandra Cervone | Prakash Krishnan | Siddhartha Jonnalagadda
Proceedings of the Fifth Workshop on e-Commerce and NLP (ECNLP 5)

In recent years, large pretrained models have been used in dialogue systems to improve successful task completion rates. However, the lack of reasoning capabilities in dialogue platforms makes it difficult to provide relevant and fluent responses, unless the designers of a conversational experience spend a considerable amount of time implementing these capabilities in external rule-based modules. In this work, we propose a novel method to fine-tune pretrained transformer models, such as RoBERTa and T5, to reason over a set of facts in a given dialogue context. Our method includes a synthetic data generation mechanism that helps the model learn logical relations, such as comparison between lists of numerical values, inverse relations (and negation), inclusion and exclusion for categorical attributes, application of a combination of attributes over both numerical and categorical values, and spoken form for numerical values, without the need for additional training data. We show that the transformer-based model can perform logical reasoning to answer questions when the dialogue context contains all the required information; otherwise, it is able to extract appropriate constraints to pass to downstream components (e.g., a knowledge base) when partial information is available. We observe that transformer-based models such as UnifiedQA-T5 can be fine-tuned to perform logical reasoning (such as numerical and categorical attributes’ comparison) over attributes seen at training time (e.g., accuracy of 90%+ for comparison of smaller than kmax=5 values over a held-out test dataset).

2013

Evaluating the Use of Empirically Constructed Lexical Resources for Named Entity Recognition
Siddhartha Jonnalagadda | Trevor Cohen | Stephen Wu | Hongfang Liu | Graciela Gonzalez
Proceedings of the IWCS 2013 Workshop on Computational Semantics in Clinical Text (CSCT 2013)

Analysis of Cross-Institutional Medication Information Annotations in Clinical Notes
Sunghwan Sohn | Cheryl Clark | Scott Halgrim | Sean Murphy | Siddhartha Jonnalagadda | Kavishwar Wagholikar | Stephen Wu | Christopher Chute | Hongfang Liu
Proceedings of the IWCS 2013 Workshop on Computational Semantics in Clinical Text (CSCT 2013)

2009

Towards Effective Sentence Simplification for Automatic Processing of Biomedical Text
Siddhartha Jonnalagadda | Luis Tari | Jörg Hakenberg | Chitta Baral | Graciela Gonzalez
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers