Karthik Raman
2020
DiPair: Fast and Accurate Distillation for Trillion-Scale Text Matching and Pair Modeling
Jiecao Chen | Liu Yang | Karthik Raman | Michael Bendersky | Jung-Jung Yeh | Yun Zhou | Marc Najork | Danyang Cai | Ehsan Emadzadeh
Findings of the Association for Computational Linguistics: EMNLP 2020
Pre-trained models like BERT (Devlin et al., 2018) have dominated NLP / IR applications such as single sentence classification, text pair classification, and question answering. However, deploying these models in real systems is highly non-trivial due to their exorbitant computational costs. A common remedy is knowledge distillation (Hinton et al., 2015), which leads to faster inference. However, as we show here, existing works are not optimized for dealing with pairs (or tuples) of texts. Consequently, they are either not scalable or demonstrate subpar performance. In this work, we propose DiPair, a novel framework for distilling fast and accurate models on text pair tasks. Coupled with an end-to-end training strategy, DiPair is both highly scalable and offers improved quality-speed tradeoffs. Empirical studies conducted on both academic and real-world e-commerce benchmarks demonstrate the efficacy of the proposed approach with speedups of over 350x and minimal quality drop relative to the cross-attention teacher BERT model.
2019
Learning Multilingual Word Embeddings Using Image-Text Data
Karan Singhal | Karthik Raman | Balder ten Cate
Proceedings of the Second Workshop on Shortcomings in Vision and Language
There has been significant interest recently in learning multilingual word embeddings – in which semantically similar words across languages have similar embeddings. State-of-the-art approaches have relied on expensive labeled data, which is unavailable for low-resource languages, or have involved post-hoc unification of monolingual embeddings. In the present paper, we investigate the efficacy of multilingual embeddings learned from weakly-supervised image-text data. In particular, we propose methods for learning multilingual embeddings using image-text data, by enforcing similarity between the representations of the image and that of the text. Our experiments reveal that even without using any expensive labeled data, a bag-of-words-based embedding model trained on image-text data achieves performance comparable to the state-of-the-art on crosslingual semantic similarity tasks.
2010
Multilingual Pseudo-Relevance Feedback: Performance Study of Assisting Languages
Manoj Kumar Chinnakotla | Karthik Raman | Pushpak Bhattacharyya
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics
Co-authors
- Karan Singhal 1
- Balder ten Cate 1
- Jiecao Chen 1
- Liu Yang 1
- Michael Bendersky 1