2021
Explanations for CommonsenseQA: New Dataset and Models
Shourya Aggarwal, Divyanshu Mandowara, Vishwajeet Agrawal, Dinesh Khandelwal, Parag Singla, Dinesh Garg
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
The CommonsenseQA (CQA) dataset (Talmor et al., 2019) was recently released to advance research on the common-sense question answering (QA) task. Whereas prior work has mostly focused on proposing QA models for this dataset, our aim is to retrieve as well as generate explanations for a given (question, correct answer choice, incorrect answer choices) tuple from this dataset. Our definition of an explanation is based on certain desiderata and translates an explanation into a set of positive and negative common-sense properties (aka facts) that not only explain the correct answer choice but also refute the incorrect ones. We human-annotate a first-of-its-kind dataset (called ECQA) of positive and negative properties, as well as free-flow explanations, for 11K QA pairs taken from the CQA dataset. We propose a latent-representation-based property retrieval model as well as a GPT-2 based property generation model with a novel two-step fine-tuning procedure. We also propose a free-flow explanation generation model. Extensive experiments show that our retrieval model beats the BM25 baseline by a relative gain of 100% in F1 score, the property generation model achieves a respectable F1 score of 36.4, and the free-flow generation model achieves a similarity score of 61.9, where the last two scores are based on a human-correlated semantic similarity metric.
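As an illustration of the retrieval setting described in the abstract, here is a minimal sketch of a BM25 property-retrieval baseline of the kind the paper compares against. It assumes the rank_bm25 package; the corpus of properties and the query below are illustrative placeholders, not ECQA data.

```python
# Minimal sketch of a BM25 baseline for retrieving common-sense properties
# given a (question, answer choice) pair. Assumes the rank_bm25 package;
# the corpus and query are placeholders, not actual ECQA annotations.
from rank_bm25 import BM25Okapi

# Toy corpus of common-sense properties (facts).
corpus = [
    "A beach is a sandy area next to a body of water.",
    "People go to the beach to relax and swim.",
    "A desert is a dry area with very little water.",
]
tokenized_corpus = [doc.lower().split() for doc in corpus]
bm25 = BM25Okapi(tokenized_corpus)

# Query built from a question and a candidate answer choice.
query = "where do people go to swim in the sea beach".lower().split()

# Retrieve the top-2 candidate properties for this pair.
top_properties = bm25.get_top_n(query, corpus, n=2)
print(top_properties)
```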
2020
Transfer Learning for Related Languages: Submissions to the WMT20 Similar Language Translation Task
Lovish Madaan, Soumya Sharma, Parag Singla
Proceedings of the Fifth Conference on Machine Translation
In this paper, we describe IIT Delhi’s submissions to the WMT 2020 task on Similar Language Translation for four language directions: Hindi <-> Marathi and Spanish <-> Portuguese. We try out three different model settings for the translation task and select our primary and contrastive submissions on the basis of the performance of these three models. For our best submissions, we fine-tune the mBART model on the parallel data provided for the task. The pre-training is done using self-supervised objectives on large amounts of monolingual data in many languages. Overall, our models rank in the top four of all systems for the submitted language pairs, with first rank in Spanish -> Portuguese.
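As an illustration of the fine-tuning setup described in the abstract, here is a minimal sketch of adapting a pre-trained mBART checkpoint on a parallel sentence pair with Hugging Face transformers. The checkpoint name, language codes, and placeholder sentences are assumptions made for the example, not the exact configuration of the shared-task submissions.

```python
# Hedged sketch of fine-tuning an mBART checkpoint on one parallel example.
# Checkpoint, language codes, and sentences are illustrative assumptions;
# requires a recent version of the transformers library.
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

model_name = "facebook/mbart-large-50"  # assumed checkpoint for illustration
tokenizer = MBart50TokenizerFast.from_pretrained(
    model_name, src_lang="hi_IN", tgt_lang="mr_IN"
)
model = MBartForConditionalGeneration.from_pretrained(model_name)
model.train()

src_text = "..."  # placeholder: a Hindi source sentence from the parallel data
tgt_text = "..."  # placeholder: its Marathi reference translation

# Tokenize source and target; text_target produces the labels tensor.
batch = tokenizer(src_text, text_target=tgt_text, return_tensors="pt")

outputs = model(**batch)   # forward pass with labels
loss = outputs.loss        # cross-entropy loss used for fine-tuning
loss.backward()            # an optimizer step would follow in a training loop
```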
2016
Entity-balanced Gaussian pLSA for Automated Comparison
Danish Contractor, Parag Singla, Mausam
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies