2022
Zero-Shot Dense Retrieval with Momentum Adversarial Domain Invariant Representations
Ji Xin | Chenyan Xiong | Ashwin Srinivasan | Ankita Sharma | Damien Jose | Paul Bennett
Findings of the Association for Computational Linguistics: ACL 2022
Dense retrieval (DR) methods conduct text retrieval by first encoding texts in the embedding space and then matching them by nearest neighbor search. This requires strong locality properties from the representation space, e.g., close allocations of each small group of relevant texts, which are hard to generalize to domains without sufficient training data. In this paper, we aim to improve the generalization ability of DR models from source training domains with rich supervision signals to target domains without any relevance label, in the zero-shot setting. To achieve that, we propose Momentum adversarial Domain Invariant Representation learning (MoDIR), which introduces a momentum method to train a domain classifier that distinguishes source versus target domains, and then adversarially updates the DR encoder to learn domain invariant representations. Our experiments show that MoDIR robustly outperforms its baselines on 10+ ranking datasets collected in the BEIR benchmark in the zero-shot setup, with more than 10% relative gains on datasets with enough sensitivity for DR models’ evaluation. Source code is available at https://github.com/ji-xin/modir.
2019
Dr.Quad at MEDIQA 2019: Towards Textual Inference and Question Entailment using contextualized representations
Vinayshekhar Bannihatti Kumar | Ashwin Srinivasan | Aditi Chaudhary | James Route | Teruko Mitamura | Eric Nyberg
Proceedings of the 18th BioNLP Workshop and Shared Task
This paper presents the submissions by Team Dr.Quad to the ACL-BioNLP 2019 shared task on Textual Inference and Question Entailment in the Medical Domain. Our system is based on the prior work of Liu et al. (2019), which uses a multi-task objective function for textual entailment. In this work, we explore different strategies for generalizing state-of-the-art language understanding models to the specialized medical domain. Our results on the shared task demonstrate that incorporating domain knowledge through data augmentation is a powerful strategy for addressing challenges posed by specialized domains such as medicine.
2007
USP-IBM-1 and USP-IBM-2: The ILP-based Systems for Lexical Sample WSD in SemEval-2007
Lucia Specia | Maria das Graças Volpe Nunes | Ashwin Srinivasan | Ganesh Ramakrishnan
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)