Prajjwal Bhargava


2021

Generalization in NLI: Ways (Not) To Go Beyond Simple Heuristics
Prajjwal Bhargava | Aleksandr Drozd | Anna Rogers
Proceedings of the Second Workshop on Insights from Negative Results in NLP

Much of the recent progress in NLU has been shown to be due to models learning dataset-specific heuristics. We conduct a case study of generalization in NLI (from MNLI to the adversarially constructed HANS dataset) across a range of BERT-based architectures (adapters, Siamese Transformers, HEX debiasing), as well as with subsampling the data and increasing the model size. We report two successful and three unsuccessful strategies, all providing insights into how Transformer-based models learn to generalize.
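Of the architectures named in the abstract, adapters are the most self-contained to illustrate. The sketch below shows a Houlsby-style bottleneck adapter of the kind such studies insert into a frozen BERT layer; it is a minimal illustration, and the class name, hidden size of 768, and bottleneck size of 64 are assumptions for the example, not the paper's code.

```python
# Illustrative bottleneck adapter (assumed sizes, not the paper's implementation).
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Project down, apply a nonlinearity, project back up, and add a residual
    connection, so the frozen pretrained backbone is only lightly modified."""
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# Usage: wrap the output of each Transformer sub-layer while keeping the
# pretrained weights frozen; only the adapter parameters are trained.
x = torch.randn(2, 16, 768)      # (batch, sequence, hidden)
print(Adapter()(x).shape)        # torch.Size([2, 16, 768])
```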

2020

Adaptive Transformers for Learning Multimodal Representations
Prajjwal Bhargava
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

The use of transformers has grown from learning about language semantics to forming meaningful visiolinguistic representations. These architectures are often over-parametrized, requiring large amounts of computation. In this work, we extend adaptive approaches to learn more about model interpretability and computational efficiency. Specifically, we study adaptive attention spans and sparse and structured dropout methods to help understand how the attention mechanism extends to vision-and-language tasks. We further show that these approaches can help us learn more about how the network perceives the complexity of input sequences, sparsity preferences for different modalities, and other related phenomena.
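The adaptive attention spans studied here follow the soft-masking idea of Sukhbaatar et al. (2019): each head learns how far back it needs to attend, and attention weights beyond that span are ramped down to zero. The PyTorch sketch below is a rough, single-parameter illustration under assumed names, shapes, and hyperparameters; it is not the paper's implementation, which learns spans per head.

```python
# Illustrative adaptive attention-span mask (assumed names and sizes).
import torch
import torch.nn as nn

class AdaptiveSpanMask(nn.Module):
    def __init__(self, max_span: int = 128, ramp: int = 32):
        super().__init__()
        self.max_span = max_span
        self.ramp = ramp
        # Learnable fraction of the maximum span (one value here; per head in practice).
        self.span_frac = nn.Parameter(torch.tensor(0.5))

    def forward(self, attn: torch.Tensor) -> torch.Tensor:
        # attn: (batch, heads, query_len, key_len) attention weights.
        key_len = attn.size(-1)
        # Distance of each key position from the most recent position.
        distance = torch.arange(key_len - 1, -1, -1,
                                device=attn.device, dtype=attn.dtype)
        span = self.span_frac.clamp(0, 1) * self.max_span
        # Soft mask: 1 inside the span, linear ramp to 0 over `ramp` positions.
        mask = ((span + self.ramp - distance) / self.ramp).clamp(0, 1)
        masked = attn * mask
        # Renormalize so each query's weights still sum to 1.
        return masked / masked.sum(dim=-1, keepdim=True).clamp(min=1e-8)

attn = torch.softmax(torch.randn(1, 2, 8, 8), dim=-1)
print(AdaptiveSpanMask(max_span=8, ramp=4)(attn).shape)  # torch.Size([1, 2, 8, 8])
```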