Abhijeet Awasthi


Bootstrapping Multilingual Semantic Parsers using Large Language Models
Abhijeet Awasthi | Nitish Gupta | Bidisha Samanta | Shachi Dave | Sunita Sarawagi | Partha Talukdar
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Despite cross-lingual generalization demonstrated by pre-trained multilingual models, the translate-train paradigm of transferring English datasets across multiple languages remains to be a key mechanism for training task-specific multilingual models. However, for many low-resource languages, the availability of a reliable translation service entails significant amounts of costly human-annotated translation pairs. Further, translation services may continue to be brittle due to domain mismatch between task-specific input text and general-purpose text used for training translation models. For multilingual semantic parsing, we demonstrate the effectiveness and flexibility offered by large language models (LLMs) for translating English datasets into several languages via few-shot prompting. Through extensive comparisons on two public datasets, MTOP and MASSIVE, spanning 50 languages and several domains, we show that our method of translating data using LLMs outperforms a strong translate-train baseline on 41 out of 50 languages. We study the key design choices that enable more effective multilingual data translation via prompted LLMs.


Diverse Parallel Data Synthesis for Cross-Database Adaptation of Text-to-SQL Parsers
Abhijeet Awasthi | Ashutosh Sathe | Sunita Sarawagi
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Text-to-SQL parsers typically struggle with databases unseen during the train time. Adapting Text-to-SQL parsers to new database schemas is a challenging problem owing to a vast diversity of schemas and zero availability of natural language queries in new schemas. We present ReFill, a framework for synthesizing high-quality and textually diverse parallel datasets for adapting Text-to-SQL parsers. Unlike prior methods that utilize SQL-to-Text generation, ReFill learns to retrieve-and-edit text queries in existing schemas and transfer them to the new schema. ReFill utilizes a simple method for retrieving diverse existing text, masking their schema-specific tokens, and refilling with tokens relevant to the new schema. We show that this process leads to significantly more diverse text queries than achievable by standard SQL-to-Text generation models. Through experiments on several databases, we show that adapting a parser by finetuning it on datasets synthesized by ReFill consistently outperforms prior data-augmentation methods.


Exploiting Language Relatedness for Low Web-Resource Language Model Adaptation: An Indic Languages Study
Yash Khemchandani | Sarvesh Mehtani | Vaidehi Patil | Abhijeet Awasthi | Partha Talukdar | Sunita Sarawagi
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Recent research in multilingual language models (LM) has demonstrated their ability to effectively handle multiple languages in a single model. This holds promise for low web-resource languages (LRL) as multilingual models can enable transfer of supervision from high resource languages to LRLs. However, incorporating a new language in an LM still remains a challenge, particularly for languages with limited corpora and in unseen scripts. In this paper we argue that relatedness among languages in a language family may be exploited to overcome some of the corpora limitations of LRLs, and propose RelateLM. We focus on Indian languages, and exploit relatedness along two dimensions: (1) script (since many Indic scripts originated from the Brahmic script), and (2) sentence structure. RelateLM uses transliteration to convert the unseen script of limited LRL text into the script of a Related Prominent Language (RPL) (Hindi in our case). While exploiting similar sentence structures, RelateLM utilizes readily available bilingual dictionaries to pseudo translate RPL text into LRL corpora. Experiments on multiple real-world benchmark datasets provide validation to our hypothesis that using a related language as pivot, along with transliteration and pseudo translation based data augmentation, can be an effective way to adapt LMs for LRLs, rather than direct training or pivoting through English.


What’s in a Name? Are BERT Named Entity Representations just as Good for any other Name?
Sriram Balasubramanian | Naman Jain | Gaurav Jindal | Abhijeet Awasthi | Sunita Sarawagi
Proceedings of the 5th Workshop on Representation Learning for NLP

We evaluate named entity representations of BERT-based NLP models by investigating their robustness to replacements from the same typed class in the input. We highlight that on several tasks while such perturbations are natural, state of the art trained models are surprisingly brittle. The brittleness continues even with the recent entity-aware BERT models. We also try to discern the cause of this non-robustness, considering factors such as tokenization and frequency of occurrence. Then we provide a simple method that ensembles predictions from multiple replacements while jointly modeling the uncertainty of type annotations and label predictions. Experiments on three NLP tasks shows that our method enhances robustness and increases accuracy on both natural and adversarial datasets.


Parallel Iterative Edit Models for Local Sequence Transduction
Abhijeet Awasthi | Sunita Sarawagi | Rasna Goyal | Sabyasachi Ghosh | Vihari Piratla
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

We present a Parallel Iterative Edit (PIE) model for the problem of local sequence transduction arising in tasks like Grammatical error correction (GEC). Recent approaches are based on the popular encoder-decoder (ED) model for sequence to sequence learning. The ED model auto-regressively captures full dependency among output tokens but is slow due to sequential decoding. The PIE model does parallel decoding, giving up the advantage of modeling full dependency in the output, yet it achieves accuracy competitive with the ED model for four reasons: 1. predicting edits instead of tokens, 2. labeling sequences instead of generating sequences, 3. iteratively refining predictions to capture dependencies, and 4. factorizing logits over edits and their token argument to harness pre-trained language models like BERT. Experiments on tasks spanning GEC, OCR correction and spell correction demonstrate that the PIE model is an accurate and significantly faster alternative for local sequence transduction.