Adapting Event Extractors to Medical Data: Bridging the Covariate Shift

Aakanksha Naik, Jill Fain Lehman, Carolyn Rose


Abstract
We tackle the task of adapting event extractors to new domains without labeled data, by aligning the marginal distributions of source and target domains. As a testbed, we create two new event extraction datasets using English texts from two medical domains: (i) clinical notes, and (ii) doctor-patient conversations. We test the efficacy of three marginal alignment techniques: (i) adversarial domain adaptation (ADA), (ii) domain adaptive fine-tuning (DAFT), and (iii) a new instance weighting technique based on language model likelihood scores (LIW). LIW and DAFT improve over a no-transfer BERT baseline on both domains, but ADA only improves on notes. Deeper analysis of performance under different types of shifts (e.g., lexical shift, semantic shift) explains some of the variations among models. Our best-performing models reach F1 scores of 70.0 and 72.9 on notes and conversations respectively, using no labeled target data.
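The abstract names instance weighting from language model likelihood scores (LIW) but gives no formula. The sketch below shows one plausible reading of the general idea, not the authors' formulation: source-domain training sentences that a target-domain language model finds more likely receive larger weights in the training loss. The function names, the softmax normalization, the `temperature` parameter, and the example scores are all assumptions made for illustration.

```python
import math


def likelihood_weights(avg_log_likelihoods, temperature=1.0):
    """Map per-sentence LM scores to instance weights whose mean is 1.

    Assumption: each score is the average token log-likelihood of a
    source-domain sentence under a language model trained on target text.
    The softmax normalization here is illustrative, not taken from the paper.
    """
    if not avg_log_likelihoods:
        return []
    scaled = [s / temperature for s in avg_log_likelihoods]
    max_s = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - max_s) for s in scaled]
    total = sum(exps)
    n = len(exps)
    return [n * e / total for e in exps]


def weighted_loss(losses, weights):
    """Weighted mean of per-instance losses."""
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)


# Hypothetical scores for three source sentences (higher = more target-like).
scores = [-2.0, -3.5, -6.0]
weights = likelihood_weights(scores)

# Hypothetical per-instance losses from an event extractor.
losses = [0.4, 0.9, 1.3]
print(weights, weighted_loss(losses, weights))
```

In this sketch, a lower `temperature` concentrates the weight on the most target-like sentences, while a higher one approaches uniform weighting, which corresponds to ordinary unweighted training.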
Anthology ID:
2021.eacl-main.258
Volume:
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Month:
April
Year:
2021
Address:
Online
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
2963–2975
URL:
https://aclanthology.org/2021.eacl-main.258
DOI:
10.18653/v1/2021.eacl-main.258
Cite (ACL):
Aakanksha Naik, Jill Fain Lehman, and Carolyn Rose. 2021. Adapting Event Extractors to Medical Data: Bridging the Covariate Shift. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 2963–2975, Online. Association for Computational Linguistics.
Cite (Informal):
Adapting Event Extractors to Medical Data: Bridging the Covariate Shift (Naik et al., EACL 2021)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2021.eacl-main.258.pdf