QA Domain Adaptation using Hidden Space Augmentation and Self-Supervised Contrastive Adaptation

Zhenrui Yue, Huimin Zeng, Bernhard Kratzwald, Stefan Feuerriegel, Dong Wang


Abstract
Question answering (QA) has recently shown impressive results for answering questions from customized domains. Yet, a common challenge is to adapt QA models to an unseen target domain. In this paper, we propose a novel self-supervised framework called QADA for QA domain adaptation. QADA introduces a novel data augmentation pipeline used to augment training QA samples. Different from existing methods, we enrich the samples via hidden space augmentation. For questions, we introduce multi-hop synonyms and sample augmented token embeddings with Dirichlet distributions. For contexts, we develop an augmentation method which learns to drop context spans via a custom attentive sampling strategy. Additionally, contrastive learning is integrated in the proposed self-supervised adaptation framework QADA. Unlike existing approaches, we generate pseudo labels and propose to train the model via a novel attention-based contrastive adaptation method. The attention weights are used to build informative features for discrepancy estimation that helps the QA model separate answers and generalize across source and target domains. To the best of our knowledge, our work is the first to leverage hidden space augmentation and attention-based contrastive adaptation for self-supervised domain adaptation in QA. Our evaluation shows that QADA achieves considerable improvements on multiple target datasets over state-of-the-art baselines in QA domain adaptation.
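The question-side augmentation described in the abstract (mixing a token embedding with synonym embeddings using Dirichlet-sampled weights) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `dirichlet_mix`, the concentration parameter `alpha`, and the random vectors standing in for real token and synonym embeddings are all assumptions made for illustration, and the multi-hop synonym lookup itself is not shown.

```python
import numpy as np


def dirichlet_mix(token_vec, synonym_vecs, alpha=1.0, rng=None):
    """Return a convex combination of a token embedding and its synonym embeddings.

    Hypothetical helper: the mixing weights are drawn from a symmetric
    Dirichlet distribution, so they are non-negative and sum to one.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Stack the original token with its K synonyms: shape (K + 1, d).
    candidates = np.vstack([token_vec[None, :], synonym_vecs])
    # One weight per candidate embedding.
    weights = rng.dirichlet(np.full(len(candidates), alpha))
    # The weighted sum is the augmented embedding, shape (d,).
    return weights @ candidates, weights


# Toy data: random vectors stand in for real embeddings (assumption).
rng = np.random.default_rng(0)
token_vec = rng.normal(size=8)
synonym_vecs = rng.normal(size=(3, 8))
augmented, weights = dirichlet_mix(token_vec, synonym_vecs, alpha=1.0, rng=rng)
```

Because the weights form a convex combination, the augmented embedding stays inside the region spanned by the token and its synonyms. In this sketch, a smaller `alpha` tends to concentrate the weight on a single candidate, while a larger `alpha` pushes the mix toward a uniform average.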
Anthology ID:
2022.emnlp-main.147
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
2308–2321
URL:
https://aclanthology.org/2022.emnlp-main.147
Cite (ACL):
Zhenrui Yue, Huimin Zeng, Bernhard Kratzwald, Stefan Feuerriegel, and Dong Wang. 2022. QA Domain Adaptation using Hidden Space Augmentation and Self-Supervised Contrastive Adaptation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 2308–2321, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
QA Domain Adaptation using Hidden Space Augmentation and Self-Supervised Contrastive Adaptation (Yue et al., EMNLP 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.emnlp-main.147.pdf