Shirish Shevade


2022

Efficient Constituency Tree based Encoding for Natural Language to Bash Translation
Shikhar Bharadwaj | Shirish Shevade
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Bash is a Unix command language used for interacting with the Operating System. Recent works on natural language to Bash translation have made significant advances, but none of the previous methods utilize the problem’s inherent structure. We identify this structure and propose a Segmented Invocation Transformer (SIT) that utilizes the information from the constituency parse tree of the natural language text. Our method is motivated by the alignment between segments in the natural language text and Bash command components. Incorporating the structure in the modelling improves the performance of the model. Since such systems must be universally accessible, we benchmark the inference times on a CPU rather than a GPU. We observe a 1.8x improvement in the inference time and a 5x reduction in model parameters. Attribution analysis using Integrated Gradients reveals that the proposed method can capture the problem structure.

2021

Explainable Natural Language to Bash Translation using Abstract Syntax Tree
Shikhar Bharadwaj | Shirish Shevade
Proceedings of the 25th Conference on Computational Natural Language Learning

Natural language processing for program synthesis has been widely researched. In this work, we focus on generating Bash commands from natural language invocations with explanations. We propose a novel transformer-based solution by utilizing Bash Abstract Syntax Trees and manual pages. Our method incorporates tree structure information in the transformer architecture and provides explanations for its predictions via alignment matrices between user invocation and manual page text. Our method performs on par with the state of the art on the Natural Language Context to Command task and performs better than fine-tuned T5 and Seq2Seq models.

2017

Latent Space Embedding for Retrieval in Question-Answer Archives
Deepak P | Dinesh Garg | Shirish Shevade
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Community-driven Question Answering (CQA) systems such as Yahoo! Answers have become valuable sources of reusable information. CQA retrieval enables usage of historical CQA archives to solve new questions posed by users. This task has received much recent attention, with methods building upon literature from translation models, topic models, and deep learning. In this paper, we devise a CQA retrieval technique, LASER-QA, that embeds question-answer pairs within a unified latent space preserving the local neighborhood structure of question and answer spaces. The idea is that such a space mirrors semantic similarity among questions as well as answers, thereby enabling high quality retrieval. Through an empirical analysis on various real-world QA datasets, we illustrate the improved effectiveness of LASER-QA over state-of-the-art methods.

2012

Extension of TSVM to Multi-Class and Hierarchical Text Classification Problems With General Losses
Sathiya Keerthi Selvaraj | Sundararajan Sellamanickam | Shirish Shevade
Proceedings of COLING 2012: Posters