BBAEG: Towards BERT-based Biomedical Adversarial Example Generation for Text Classification

Ishani Mondal

doi:10.18653/v1/2021.naacl-main.423


Abstract

Healthcare predictive analytics aids medical decision-making, diagnosis prediction and drug review analysis. Prediction accuracy is therefore an important criterion, which in turn calls for robust predictive language models. However, deep learning models have been shown to be vulnerable to slightly perturbed input instances that humans are unlikely to misclassify. Recent efforts to generate adversarial examples using rule-based synonyms and BERT-MLMs have appeared in the general domain, but the ever-increasing biomedical literature poses unique challenges. We propose BBAEG (Biomedical BERT-based Adversarial Example Generation), a black-box attack algorithm for biomedical text classification that combines domain-specific synonym replacement for biomedical named entities with BERT-MLM predictions, spelling variation and number replacement. Through automatic and human evaluation on two datasets, we demonstrate that BBAEG performs a stronger attack with better language fluency and semantic coherence than prior work.
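To make the black-box setting concrete, here is a minimal Python sketch of a greedy word-level attack that only queries a classifier's output probabilities. It is an illustration under stated assumptions, not the BBAEG implementation: the helper names (`number_variants`, `spelling_variants`, `greedy_attack`, `toy_predict`), the perturbation rules, the hand-written synonym table and the left-to-right token order are all hypothetical, and the BERT-MLM and biomedical named-entity components mentioned in the abstract are left out.

```python
import re
from typing import Callable, Dict, List, Optional

def number_variants(token: str) -> List[str]:
    """Hypothetical rule: replace an integer token with its neighbours."""
    if not re.fullmatch(r"\d+", token):
        return []
    n = int(token)
    return [v for v in (str(n + 1), str(max(n - 1, 0))) if v != token]

def spelling_variants(token: str) -> List[str]:
    """Hypothetical rule: swap two adjacent inner characters."""
    if len(token) < 4 or not token.isalpha():
        return []
    swaps = [token[:i] + token[i + 1] + token[i] + token[i + 2:]
             for i in range(1, len(token) - 2)]
    return [s for s in swaps if s != token]

def greedy_attack(tokens: List[str],
                  predict: Callable[[List[str]], Dict[str, float]],
                  label: str,
                  synonyms: Optional[Dict[str, List[str]]] = None) -> Optional[List[str]]:
    """Black-box attack: uses only predict(tokens) -> {label: probability}.

    Visits tokens left to right (an assumption of this sketch), keeps the
    candidate that lowers the probability of `label` the most, and stops
    as soon as the predicted label flips. Returns None if no flip occurs.
    """
    synonyms = synonyms or {}
    current = list(tokens)
    for i, original in enumerate(tokens):
        candidates = (synonyms.get(original.lower(), [])
                      + number_variants(original)
                      + spelling_variants(original))
        best_prob, best_tok = predict(current)[label], original
        for cand in candidates:
            prob = predict(current[:i] + [cand] + current[i + 1:])[label]
            if prob < best_prob:
                best_prob, best_tok = prob, cand
        current[i] = best_tok
        probs = predict(current)
        if max(probs, key=probs.get) != label:
            return current
    return None

def toy_predict(tokens: List[str]) -> Dict[str, float]:
    """Toy keyword classifier standing in for the target model."""
    p = 0.9 if "nausea" in tokens else 0.1
    return {"adverse": p, "other": 1.0 - p}

adv = greedy_attack(["severe", "nausea", "after", "20", "mg"],
                    toy_predict, "adverse",
                    synonyms={"nausea": ["queasiness"]})
print(adv)  # ['severe', 'queasiness', 'after', '20', 'mg']
```

In this toy run the synonym substitution alone flips the prediction, so the number and spelling rules are never needed; against a real classifier, the candidate set and the order in which tokens are visited would matter far more.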

Anthology ID: 2021.naacl-main.423
Volume: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month: June
Year: 2021
Address: Online
Editors: Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 5378–5384
URL: https://aclanthology.org/2021.naacl-main.423
DOI: 10.18653/v1/2021.naacl-main.423
Cite (ACL): Ishani Mondal. 2021. BBAEG: Towards BERT-based Biomedical Adversarial Example Generation for Text Classification. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5378–5384, Online. Association for Computational Linguistics.
Cite (Informal): BBAEG: Towards BERT-based Biomedical Adversarial Example Generation for Text Classification (Mondal, NAACL 2021)
PDF: https://preview.aclanthology.org/nschneid-patch-5/2021.naacl-main.423.pdf
Optional supplementary data: 2021.naacl-main.423.OptionalSupplementaryData.pdf
Video: https://preview.aclanthology.org/nschneid-patch-5/2021.naacl-main.423.mp4
Code: Ishani-Mondal/BBAEG
