Yash Kumar Lal


2022

pdf
SBU Figures It Out: Models Explain Figurative Language
Yash Kumar Lal | Mohaddeseh Bastan
Proceedings of the 3rd Workshop on Figurative Language Processing (FLP)

Figurative language is ubiquitous in human communication. However, current NLP models are unable to demonstrate a significant understanding of instances of this phenomena. The EMNLP 2022 shared task on figurative language understanding posed the problem of predicting and explaining the relation between a premise and a hypothesis containing an instance of the use of figurative language. We experiment with different variations of using T5-large for this task and build a model that significantly outperforms the task baseline. Treating it as a new task for T5 and simply finetuning on the data achieves the best score on the defined evaluation. Furthermore, we find that hypothesis-only models are able to achieve most of the performance.

pdf
Using Commonsense Knowledge to Answer Why-Questions
Yash Kumar Lal | Niket Tandon | Tanvi Aggarwal | Horace Liu | Nathanael Chambers | Raymond Mooney | Niranjan Balasubramanian
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Answering questions in narratives about why events happened often requires commonsense knowledge external to the text. What aspects of this knowledge are available in large language models? What aspects can be made accessible via external commonsense resources? We study these questions in the context of answering questions in the TellMeWhy dataset using COMET as a source of relevant commonsense relations. We analyze the effects of model size (T5 and GPT3) along with methods of injecting knowledge (COMET) into these models. Results show that the largest models, as expected, yield substantial improvements over base models. Injecting external knowledge helps models of various sizes, but the amount of improvement decreases with larger model size. We also find that the format in which knowledge is provided is critical, and that smaller models benefit more from larger amounts of knowledge. Finally, we develop an ontology of knowledge types and analyze the relative coverage of the models across these categories.

pdf bib
Proceedings of the First Workshop on Commonsense Representation and Reasoning (CSRR 2022)
Antoine Bosselut | Xiang Li | Bill Yuchen Lin | Vered Shwartz | Bodhisattwa Prasad Majumder | Yash Kumar Lal | Rachel Rudinger | Xiang Ren | Niket Tandon | Vilém Zouhar
Proceedings of the First Workshop on Commonsense Representation and Reasoning (CSRR 2022)

2021

pdf
IrEne-viz: Visualizing Energy Consumption of Transformer Models
Yash Kumar Lal | Reetu Singh | Harsh Trivedi | Qingqing Cao | Aruna Balasubramanian | Niranjan Balasubramanian
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

IrEne is an energy prediction system that accurately predicts the interpretable inference energy consumption of a wide range of Transformer-based NLP models. We present the IrEne-viz tool, an online platform for visualizing and exploring energy consumption of various Transformer-based models easily. Additionally, we release a public API that can be used to access granular information about energy consumption of transformer models and their components. The live demo is available at http://stonybrooknlp.github.io/irene/demo/.

pdf
TellMeWhy: A Dataset for Answering Why-Questions in Narratives
Yash Kumar Lal | Nathanael Chambers | Raymond Mooney | Niranjan Balasubramanian
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

pdf
IrEne: Interpretable Energy Prediction for Transformers
Qingqing Cao | Yash Kumar Lal | Harsh Trivedi | Aruna Balasubramanian | Niranjan Balasubramanian
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Existing software-based energy measurements of NLP models are not accurate because they do not consider the complex interactions between energy consumption and model execution. We present IrEne, an interpretable and extensible energy prediction system that accurately predicts the inference energy consumption of a wide range of Transformer-based NLP models. IrEne constructs a model tree graph that breaks down the NLP model into modules that are further broken down into low-level machine learning (ML) primitives. IrEne predicts the inference energy consumption of the ML primitives as a function of generalizable features and fine-grained runtime resource usage. IrEne then aggregates these low-level predictions recursively to predict the energy of each module and finally of the entire model. Experiments across multiple Transformer models show IrEne predicts inference energy consumption of transformer models with an error of under 7% compared to the ground truth. In contrast, existing energy models see an error of over 50%. We also show how IrEne can be used to conduct energy bottleneck analysis and to easily evaluate the energy impact of different architectural choices. We release the code and data at https://github.com/StonyBrookNLP/irene.

2020

pdf
Temporal Reasoning in Natural Language Inference
Siddharth Vashishtha | Adam Poliak | Yash Kumar Lal | Benjamin Van Durme | Aaron Steven White
Findings of the Association for Computational Linguistics: EMNLP 2020

We introduce five new natural language inference (NLI) datasets focused on temporal reasoning. We recast four existing datasets annotated for event duration—how long an event lasts—and event ordering—how events are temporally arranged—into more than one million NLI examples. We use these datasets to investigate how well neural models trained on a popular NLI corpus capture these forms of temporal reasoning.

2019

pdf
Johns Hopkins University Submission for WMT News Translation Task
Kelly Marchisio | Yash Kumar Lal | Philipp Koehn
Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1)

We describe the work of Johns Hopkins University for the shared task of news translation organized by the Fourth Conference on Machine Translation (2019). We submitted systems for both directions of the English-German language pair. The systems combine multiple techniques – sampling, filtering, iterative backtranslation, and continued training – previously used to improve performance of neural machine translation models. At submission time, we achieve a BLEU score of 38.1 for De-En and 42.5 for En-De translation directions on newstest2019. Post-submission, the score is 38.4 for De-En and 42.8 for En-De. Various experiments conducted in the process are also described.

pdf
Sentence-Level Adaptation for Low-Resource Neural Machine Translation
Aaron Mueller | Yash Kumar Lal
Proceedings of the 2nd Workshop on Technologies for MT of Low Resource Languages

pdf
De-Mixing Sentiment from Code-Mixed Text
Yash Kumar Lal | Vaibhav Kumar | Mrinal Dhar | Manish Shrivastava | Philipp Koehn
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

Code-mixing is the phenomenon of mixing the vocabulary and syntax of multiple languages in the same sentence. It is an increasingly common occurrence in today’s multilingual society and poses a big challenge when encountered in different downstream tasks. In this paper, we present a hybrid architecture for the task of Sentiment Analysis of English-Hindi code-mixed data. Our method consists of three components, each seeking to alleviate different issues. We first generate subword level representations for the sentences using a CNN architecture. The generated representations are used as inputs to a Dual Encoder Network which consists of two different BiLSTMs - the Collective and Specific Encoder. The Collective Encoder captures the overall sentiment of the sentence, while the Specific Encoder utilizes an attention mechanism in order to focus on individual sentiment-bearing sub-words. This, combined with a Feature Network consisting of orthographic features and specially trained word embeddings, achieves state-of-the-art results - 83.54% accuracy and 0.827 F1 score - on a benchmark dataset.