Vinay Namboodiri


More Parameters? No Thanks!
Zeeshan Khan | Kartheek Akella | Vinay Namboodiri | C V Jawahar
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021


PhraseOut: A Code Mixed Data Augmentation Method for MultilingualNeural Machine Tranlsation
Binu Jasim | Vinay Namboodiri | C V Jawahar
Proceedings of the 17th International Conference on Natural Language Processing (ICON)

Data Augmentation methods for Neural Machine Translation (NMT) such as back- translation (BT) and self-training (ST) are quite popular. In a multilingual NMT system, simply copying monolingual source sentences to the target (Copying) is an effective data augmentation method. Back-translation aug- ments parallel data by translating monolingual sentences in the target side to source language. In this work we propose to use a partial back- translation method in a multilingual setting. Instead of translating the entire monolingual target sentence back into the source language, we replace selected high confidence phrases only and keep the rest of the words in the target language itself. (We call this method PhraseOut). Our experiments on low resource multilingual translation models show that PhraseOut gives reasonable improvements over the existing data augmentation methods.


CVIT’s submissions to WAT-2019
Jerin Philip | Shashank Siripragada | Upendra Kumar | Vinay Namboodiri | C V Jawahar
Proceedings of the 6th Workshop on Asian Translation

This paper describes the Neural Machine Translation systems used by IIIT Hyderabad (CVIT-MT) for the translation tasks part of WAT-2019. We participated in tasks pertaining to Indian languages and submitted results for English-Hindi, Hindi-English, English-Tamil and Tamil-English language pairs. We employ Transformer architecture experimenting with multilingual models and methods for low-resource languages.


Learning Semantic Sentence Embeddings using Sequential Pair-wise Discriminator
Badri Narayana Patro | Vinod Kumar Kurmi | Sandeep Kumar | Vinay Namboodiri
Proceedings of the 27th International Conference on Computational Linguistics

In this paper, we propose a method for obtaining sentence-level embeddings. While the problem of securing word-level embeddings is very well studied, we propose a novel method for obtaining sentence-level embeddings. This is obtained by a simple method in the context of solving the paraphrase generation task. If we use a sequential encoder-decoder model for generating paraphrase, we would like the generated paraphrase to be semantically close to the original sentence. One way to ensure this is by adding constraints for true paraphrase embeddings to be close and unrelated paraphrase candidate sentence embeddings to be far. This is ensured by using a sequential pair-wise discriminator that shares weights with the encoder that is trained with a suitable loss function. Our loss function penalizes paraphrase sentence embedding distances from being too large. This loss is used in combination with a sequential encoder-decoder network. We also validated our method by evaluating the obtained embeddings for a sentiment analysis task. The proposed method results in semantic embeddings and outperforms the state-of-the-art on the paraphrase generation and sentiment analysis task on standard datasets. These results are also shown to be statistically significant.

Multimodal Differential Network for Visual Question Generation
Badri Narayana Patro | Sandeep Kumar | Vinod Kumar Kurmi | Vinay Namboodiri
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Generating natural questions from an image is a semantic task that requires using visual and language modality to learn multimodal representations. Images can have multiple visual and language contexts that are relevant for generating questions namely places, captions, and tags. In this paper, we propose the use of exemplars for obtaining the relevant context. We obtain this by using a Multimodal Differential Network to produce natural and engaging questions. The generated questions show a remarkable similarity to the natural questions as validated by a human study. Further, we observe that the proposed approach substantially improves over state-of-the-art benchmarks on the quantitative metrics (BLEU, METEOR, ROUGE, and CIDEr).