Closing the Gap: Joint De-Identification and Concept Extraction in the Clinical Domain

Lukas Lange; Heike Adel; Jannik Strötgen

doi:10.18653/v1/2020.acl-main.621

Abstract

Exploiting natural language processing in the clinical domain requires de-identification, i.e., anonymization of personal information in texts. However, current research considers de-identification and downstream tasks, such as concept extraction, only in isolation and does not study the effects of de-identification on other tasks. In this paper, we close this gap by reporting concept extraction performance on automatically anonymized data and investigating joint models for de-identification and concept extraction. In particular, we propose a stacked model with restricted access to privacy-sensitive information and a multitask model. We set a new state of the art on benchmark datasets in English (96.1% F1 for de-identification and 88.9% F1 for concept extraction) and Spanish (91.4% F1 for concept extraction).

Anthology ID: 2020.acl-main.621
Volume: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Month: July
Year: 2020
Address: Online
Editors: Dan Jurafsky, Joyce Chai, Natalie Schluter, Joel Tetreault
Venue: ACL
Publisher: Association for Computational Linguistics
Pages: 6945–6952
URL: https://aclanthology.org/2020.acl-main.621
DOI: 10.18653/v1/2020.acl-main.621
Cite (ACL): Lukas Lange, Heike Adel, and Jannik Strötgen. 2020. Closing the Gap: Joint De-Identification and Concept Extraction in the Clinical Domain. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 6945–6952, Online. Association for Computational Linguistics.
Cite (Informal): Closing the Gap: Joint De-Identification and Concept Extraction in the Clinical Domain (Lange et al., ACL 2020)
PDF: https://preview.aclanthology.org/nschneid-patch-4/2020.acl-main.621.pdf
Video: http://slideslive.com/38928788
Code: boschresearch/joint_anonymization_extraction
