Nishant Kambhatla


2021

Measuring and Improving Faithfulness of Attention in Neural Machine Translation
Pooya Moradi | Nishant Kambhatla | Anoop Sarkar
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume

While the attention heatmaps produced by neural machine translation (NMT) models seem insightful, there is little evidence that they reflect a model’s true internal reasoning. We provide a measure of faithfulness for NMT based on a variety of stress tests in which attention weights that are crucial for prediction are perturbed; if the learned weights are a faithful explanation of the predictions, the model should alter its predictions. We show that our proposed faithfulness measure for NMT models can be improved using a novel differentiable objective that rewards faithful behaviour by the model through probability divergence. Our experimental results on multiple language pairs show that our objective function is effective in increasing faithfulness and can lead to a useful analysis of NMT model behaviour and more trustworthy attention heatmaps. Our proposed objective improves faithfulness without reducing translation quality, has a useful regularization effect on the NMT model, and can even improve translation quality in some cases.
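
The stress-test idea in this abstract can be illustrated with a toy sketch. It is not the paper's measure or objective: the one-layer decoder step, the choice of perturbation (zeroing the largest weight and renormalising), and the names `predict`, `zero_out_max`, and `changed_fraction` are all assumptions made for illustration.

```python
def predict(attention, source_vectors, output_weights):
    """Toy decoder step (assumed): attention-weighted context, then a linear layer."""
    dim = len(source_vectors[0])
    context = [
        sum(a * vec[d] for a, vec in zip(attention, source_vectors))
        for d in range(dim)
    ]
    logits = [sum(w * c for w, c in zip(row, context)) for row in output_weights]
    # Index of the highest-scoring output token (first one wins on ties).
    return max(range(len(logits)), key=logits.__getitem__)


def zero_out_max(attention):
    """One possible stress test: drop the largest weight and renormalise the rest."""
    top = max(range(len(attention)), key=attention.__getitem__)
    perturbed = [0.0 if i == top else a for i, a in enumerate(attention)]
    total = sum(perturbed)
    if total == 0.0:
        # Nothing left to renormalise (e.g. a single source position): use uniform.
        return [1.0 / len(attention)] * len(attention)
    return [p / total for p in perturbed]


def changed_fraction(attention_steps, source_vectors, output_weights):
    """Fraction of decoding steps whose prediction changes under the perturbation."""
    changed = 0
    for attention in attention_steps:
        before = predict(attention, source_vectors, output_weights)
        after = predict(zero_out_max(attention), source_vectors, output_weights)
        changed += before != after
    return changed / len(attention_steps)


# Hypothetical data: three source positions, two output tokens.
source_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
output_weights = [[1.0, 0.0], [0.0, 1.0]]
attention_steps = [[0.5, 0.1, 0.4], [0.1, 0.8, 0.1]]
rate = changed_fraction(attention_steps, source_vectors, output_weights)
```

In this toy setting, a higher `rate` means the predictions depend more on the most heavily attended position, which is the intuition behind treating attention as a faithful explanation.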

2020

Training with Adversaries to Improve Faithfulness of Attention in Neural Machine Translation
Pooya Moradi | Nishant Kambhatla | Anoop Sarkar
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing: Student Research Workshop

Can we trust that the attention heatmaps produced by a neural machine translation (NMT) model reflect its true internal reasoning? We isolate and examine in detail the notion of faithfulness in NMT models. We provide a measure of faithfulness for NMT based on a variety of stress tests in which model parameters are perturbed, measuring faithfulness by how often the model output changes. We show that our proposed faithfulness measure for NMT models can be improved using a novel differentiable objective that rewards faithful behaviour by the model through probability divergence. Our experimental results on multiple language pairs show that our objective function is effective in increasing faithfulness and can lead to a useful analysis of NMT model behaviour and more trustworthy attention heatmaps. Our proposed objective improves faithfulness without reducing translation quality; it also seems to have a useful regularization effect on the NMT model and can even improve translation quality in some cases.

2019

Sign Clustering and Topic Extraction in Proto-Elamite
Logan Born | Kate Kelley | Nishant Kambhatla | Carolyn Chen | Anoop Sarkar
Proceedings of the 3rd Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

We describe a first attempt at using techniques from computational linguistics to analyze the undeciphered proto-Elamite script. Using hierarchical clustering, n-gram frequencies, and LDA topic models, we both replicate results obtained by manual decipherment and reveal previously unobserved relationships between signs. This demonstrates the utility of these techniques as an aid to manual decipherment.

Interrogating the Explanatory Power of Attention in Neural Machine Translation
Pooya Moradi | Nishant Kambhatla | Anoop Sarkar
Proceedings of the 3rd Workshop on Neural Generation and Translation

Attention models have become a crucial component in neural machine translation (NMT). They are often implicitly or explicitly used to justify the model’s decision in generating a specific token, but it has not yet been rigorously established to what extent attention is a reliable source of information in NMT. To evaluate the explanatory power of attention for NMT, we examine the possibility of yielding the same prediction but with counterfactual attention models that modify crucial aspects of the trained attention model. Using these counterfactual attention mechanisms, we assess the extent to which they still preserve the generation of function and content words in the translation process. Compared to a state-of-the-art attention model, our counterfactual attention models produce 68% of function words and 21% of content words in our German-English dataset. Our experiments demonstrate that attention models by themselves cannot reliably explain the decisions made by an NMT model.

2018

Decipherment of Substitution Ciphers with Neural Language Models
Nishant Kambhatla | Anahita Mansouri Bigvand | Anoop Sarkar
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

Decipherment of homophonic substitution ciphers using language models is a well-studied task in NLP. Previous work on this topic scores short local spans of possible plaintext decipherments using n-gram language models. The most widely used technique is beam search with n-gram language models, proposed by Nuhn et al. (2013). We propose a beam search algorithm that scores the entire candidate plaintext at each step of the decipherment using a neural language model. We augment beam search with a novel rest cost estimation that exploits the prediction power of a neural language model. We compare against the state-of-the-art n-gram-based methods on many different decipherment tasks. On challenging ciphers such as the Beale cipher, we provide significantly better error rates with much smaller beam sizes.
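
The beam search described in this abstract can be sketched on a much simpler problem. The sketch below assumes a 1:1 substitution cipher rather than a homophonic one, replaces the neural language model with a small bigram log-probability table, and omits the rest cost estimation. The names `score_text`, `decipher`, and `beam_search`, and the toy table, are hypothetical.

```python
from collections import Counter


def score_text(text, bigram_logprob, floor=-10.0):
    """Stand-in for a neural LM: sum bigram log-probs over fully known pairs."""
    total = 0.0
    for left, right in zip(text, text[1:]):
        if "_" in (left, right):  # skip positions that are not yet deciphered
            continue
        total += bigram_logprob.get(left + right, floor)
    return total


def decipher(cipher, key):
    """Apply a (possibly partial) key; unmapped symbols become '_'."""
    return "".join(key.get(symbol, "_") for symbol in cipher)


def beam_search(cipher, alphabet, bigram_logprob, beam_size=5):
    """Assumes the alphabet has at least as many letters as distinct cipher symbols."""
    # Fix the most frequent cipher symbols first; ties are broken alphabetically.
    counts = Counter(cipher)
    order = sorted(counts, key=lambda s: (-counts[s], s))
    beam = [{}]  # each hypothesis is a partial key: cipher symbol -> plaintext letter
    for symbol in order:
        candidates = []
        for key in beam:
            used = set(key.values())
            for letter in alphabet:
                if letter in used:  # keep the substitution one-to-one
                    continue
                extended = {**key, symbol: letter}
                # Score the whole candidate plaintext, not only a local span.
                score = score_text(decipher(cipher, extended), bigram_logprob)
                candidates.append((score, extended))
        candidates.sort(key=lambda item: item[0], reverse=True)
        beam = [key for _, key in candidates[:beam_size]]
    return beam[0]


# Hypothetical toy language model and ciphertext.
toy_lm = {"ab": -1.0, "ba": -1.5, "bc": -2.0, "ca": -1.2}
best_key = beam_search("xyxyz", "abc", toy_lm, beam_size=6)
plaintext = decipher("xyxyz", best_key)
```

With three symbols and `beam_size=6` the search is exhaustive; smaller beams prune hypotheses early, which is where a stronger scoring model and a rest cost estimate would matter.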

Decipherment for Adversarial Offensive Language Detection
Zhelun Wu | Nishant Kambhatla | Anoop Sarkar
Proceedings of the 2nd Workshop on Abusive Language Online (ALW2)

Automated filters are commonly used by online services to stop users from sending age-inappropriate or bullying messages, or from asking others to expose personal information. Previous work has focused on rules or classifiers to detect and filter offensive messages, but these are vulnerable to cleverly disguised plaintext and unseen expressions, especially in an adversarial setting where users can repeatedly try to bypass the filter. In this paper, we model the disguised messages as if they are produced by encrypting the original message using an invented cipher. We apply automatic decipherment techniques to decode the disguised malicious text, which can then be filtered using rules or classifiers. We provide experimental results on three different datasets and show that decipherment is an effective tool for this task.