Gautam Shroff


2021

Complex Question Answering on knowledge graphs using machine translation and multi-task learning
Saurabh Srivastava | Mayur Patidar | Sudip Chowdhury | Puneet Agarwal | Indrajit Bhattacharya | Gautam Shroff
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

Question answering (QA) over a knowledge graph (KG) is the task of answering a natural language (NL) query using the information stored in the KG. In a real-world industrial setting, this involves addressing multiple challenges, including entity linking and multi-hop reasoning over the KG. Traditional approaches handle these challenges in a modularized, sequential manner, where errors in one module accumulate in downstream modules. These challenges are often inter-related, and their solutions can reinforce each other when handled simultaneously in an end-to-end learning setup. To this end, we propose a multi-task BERT-based Neural Machine Translation (NMT) model to address these challenges. Through experimental analysis, we demonstrate the efficacy of our proposed approach on one publicly available and one proprietary dataset.
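
The abstract does not include implementation details; below is a minimal Python/PyTorch sketch of what a multi-task setup along these lines could look like: a shared BERT encoder feeding both a token-level entity-linking head and an NMT-style decoder that emits a logical-form query. All class names, heads, and hyperparameters are illustrative assumptions, not the authors' code.

    import torch
    import torch.nn as nn
    from transformers import BertModel

    class MultiTaskKGQA(nn.Module):
        """Illustrative multi-task model: a shared BERT encoder with (a) a
        token-tagging head for entity linking and (b) a GRU decoder that
        emits a logical-form/SPARQL token sequence, NMT-style. Names and
        sizes are assumptions, not the authors' architecture."""

        def __init__(self, lf_vocab_size, num_el_tags=3, hidden=768):
            super().__init__()
            self.encoder = BertModel.from_pretrained("bert-base-uncased")
            self.el_head = nn.Linear(hidden, num_el_tags)    # e.g. B/I/O entity tags
            self.decoder = nn.GRU(hidden, hidden, batch_first=True)
            self.lf_head = nn.Linear(hidden, lf_vocab_size)

        def forward(self, input_ids, attention_mask, lf_embeds):
            # lf_embeds: embedded (teacher-forced) logical-form tokens, (B, T_lf, H)
            enc = self.encoder(input_ids, attention_mask=attention_mask).last_hidden_state
            el_logits = self.el_head(enc)                    # (B, T, num_el_tags)
            h0 = enc[:, :1, :].transpose(0, 1).contiguous()  # init decoder from [CLS]
            dec_out, _ = self.decoder(lf_embeds, h0)
            lf_logits = self.lf_head(dec_out)                # (B, T_lf, lf_vocab_size)
            return el_logits, lf_logits

    # Training couples the tasks so errors cannot silently accumulate across
    # separate modules, e.g.: loss = ce(el_logits, el_tags) + ce(lf_logits, lf_ids)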

2019

From Monolingual to Multilingual FAQ Assistant using Multilingual Co-training
Mayur Patidar | Surabhi Kumari | Manasi Patwardhan | Shirish Karande | Puneet Agarwal | Lovekesh Vig | Gautam Shroff
Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019)

Recent research on cross-lingual transfer shows state-of-the-art results on benchmark datasets using pre-trained language representation models (PLRM) like BERT. These results are achieved with traditional training approaches, such as Zero-shot with no data, or Translate-train and Translate-test with machine-translated data. In this work, we propose an approach of “Multilingual Co-training” (MCT), where we augment the expert-annotated dataset in the source language (English) with the corresponding machine translations in the target languages (e.g. Arabic, Spanish) and fine-tune the PLRM jointly. We observe that the proposed approach provides consistent gains in the performance of BERT on multiple benchmark datasets (e.g. a 1.0% gain on MLDocs and a 1.2% gain on XNLI over Translate-train with BERT), while requiring a single model for multiple languages. We further consider an FAQ dataset where the available English test dataset is translated by experts into Arabic and Spanish. On this dataset, we observe an average gain of 4.9% over all other cross-lingual transfer protocols with BERT. We further observe that domain-specific joint pre-training of the PLRM using HR policy documents in English along with their machine translations in the target languages, followed by joint fine-tuning, provides a further improvement of 2.8% in average accuracy.
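
As a hedged illustration of the MCT data augmentation described above, the sketch below builds the joint training set by pairing each expert-annotated English example with machine translations carrying the same label; a single multilingual PLRM is then fine-tuned once on the combined data. The helper build_mct_dataset and the translate_fn callback are hypothetical names, not from the paper.

    def build_mct_dataset(english_examples, translate_fn, target_langs=("ar", "es")):
        """Multilingual Co-training (MCT) data construction, as sketched from
        the abstract: keep every expert-annotated English example and add its
        machine translations in each target language with the same label.
        translate_fn(text, lang) is a placeholder for any MT system."""
        data = list(english_examples)
        for text, label in english_examples:
            for lang in target_langs:
                data.append((translate_fn(text, lang), label))
        return data

    # A single multilingual PLRM (e.g. multilingual BERT) is then fine-tuned
    # once on `data`, instead of training one model per language.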

2017

Information Bottleneck Inspired Method For Chat Text Segmentation
S Vishal | Mohit Yadav | Lovekesh Vig | Gautam Shroff
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

We present a novel technique for segmenting chat conversations using the information bottleneck method (Tishby et al., 2000), augmented with sequential continuity constraints. Furthermore, we utilize critical non-textual clues, such as the time between two consecutive posts and people mentions within the posts. To ascertain the effectiveness of the proposed method, we collected data from public Slack conversations and Fresco, a proprietary platform deployed inside our organization. Experiments demonstrate that the proposed method yields an absolute (relative) improvement of up to 3.23% (11.25%). To facilitate future research, we are releasing manual segmentation annotations for the public Slack conversations.
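
The paper's exact objective is not reproduced in the abstract; the sketch below shows one plausible reading of an information-bottleneck-inspired, sequentially constrained segmenter: only adjacent segments may merge, the merge cost approximates the information lost (mass-weighted Jensen-Shannon divergence between segment word distributions), and the time gap between posts enters as a non-textual penalty. Function names and the alpha weighting are assumptions, not the authors' formulation.

    import numpy as np

    def js_divergence(p, q):
        """Jensen-Shannon divergence between two word distributions."""
        m = 0.5 * (p + q)
        def kl(a, b):
            mask = a > 0
            return np.sum(a[mask] * np.log(a[mask] / b[mask]))
        return 0.5 * kl(p, m) + 0.5 * kl(q, m)

    def segment_chat(post_dists, post_weights, time_gaps, k, alpha=0.1):
        """Agglomerative IB-style segmentation with a sequential continuity
        constraint: only adjacent segments may merge. post_dists: (N, V) word
        distributions per post; post_weights: prior mass per post; time_gaps:
        seconds between consecutive posts (length N-1), a non-textual clue."""
        segs = [[i] for i in range(len(post_dists))]
        dists = [post_dists[i] for i in range(len(post_dists))]
        weights = list(post_weights)
        while len(segs) > k:
            costs = []
            for i in range(len(segs) - 1):
                w = weights[i] + weights[i + 1]
                # Information lost by merging, minus a discount for long gaps
                # (a long pause makes a boundary more likely, merging costlier).
                cost = w * js_divergence(dists[i], dists[i + 1])
                cost += alpha * time_gaps[segs[i + 1][0] - 1]
                costs.append(cost)
            j = int(np.argmin(costs))
            w = weights[j] + weights[j + 1]
            dists[j] = (weights[j] * dists[j] + weights[j + 1] * dists[j + 1]) / w
            weights[j] = w
            segs[j] = segs[j] + segs[j + 1]
            del segs[j + 1], dists[j + 1], weights[j + 1]
        return segs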

Learning and Knowledge Transfer with Memory Networks for Machine Comprehension
Mohit Yadav | Lovekesh Vig | Gautam Shroff
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

Enabling machines to read and comprehend unstructured text remains an unfulfilled goal for NLP research. Recent research efforts on the “machine comprehension” task have managed to achieve close to ideal performance on simulated data. However, achieving similar levels of performance on small real-world datasets has proved difficult; major challenges stem from the large vocabulary size, complex grammar, and frequent ambiguities in linguistic structure. Moreover, the human-generated annotations required for training, needed to ensure a sufficiently diverse set of questions, are prohibitively expensive. Motivated by these practical issues, we propose a novel curriculum-inspired training procedure for Memory Networks that improves machine comprehension performance with relatively small volumes of training data. Additionally, we explore various training regimes for Memory Networks that allow knowledge transfer from a closely related domain with larger volumes of labelled data. We also propose a loss function that incorporates the asymmetric nature of knowledge transfer. Our experiments demonstrate improvements on the Daily Mail, CNN, and MCTest datasets.
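
As an illustration of a curriculum-inspired schedule of the kind the abstract describes (the paper's actual procedure may differ), the sketch below ranks training examples by an assumed difficulty score and grows the training pool from easy to hard across phases. difficulty_fn and the phase counts are hypothetical choices.

    import random

    def curriculum_batches(dataset, difficulty_fn, num_phases=3,
                           epochs_per_phase=2, batch_size=32):
        """Curriculum-style schedule: rank examples by an assumed difficulty
        score (e.g. question length or answer ambiguity) and grow the
        training pool from easy to hard across phases. A sketch only."""
        ranked = sorted(dataset, key=difficulty_fn)
        for phase in range(1, num_phases + 1):
            # Phase 1 trains on the easiest third, phase 2 on two thirds, etc.
            pool = ranked[: phase * len(ranked) // num_phases]
            for _ in range(epochs_per_phase):
                random.shuffle(pool)
                for i in range(0, len(pool), batch_size):
                    yield pool[i:i + batch_size]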