Improving Supervised Drug-Protein Relation Extraction with Distantly Supervised Models

Naoki Iinuma, Makoto Miwa, Yutaka Sasaki


Abstract
This paper proposes novel drug-protein relation extraction models that make indirect use of distant supervision data. Specifically, instead of adding distant supervision data to the manually annotated training data, our models incorporate distantly supervised models, i.e., relation extraction models trained on distant supervision data. Distant supervision can generate a large amount of pseudo-training data at low cost, but prediction performance remains low because some of that data is mislabeled. Several methods have therefore been proposed to suppress the effects of such noisy instances by exploiting manually annotated training data; even so, their performance falls short of supervised learning on manually annotated data, because the mislabeled instances that cannot be fully suppressed act as noise during training. To overcome this issue, our methods utilize distant supervision data only indirectly, alongside manually annotated training data. Experimental results on the DrugProt corpus from BioCreative VII Track 1 show that our proposed models consistently improve over supervised models in different settings.
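To illustrate one plausible reading of "incorporating a distantly supervised model," the sketch below shows a supervised relation classifier that consumes, as extra input features, the logits of a frozen model pre-trained on distant supervision data. This is a minimal assumption-laden sketch, not the authors' exact architecture: all module names, dimensions, and the concatenation-based integration scheme are illustrative.

```python
# Hypothetical sketch: a supervised drug-protein relation extractor that
# indirectly uses distant supervision via a frozen distantly supervised (DS)
# model. Names, dimensions, and the fusion scheme are assumptions, not the
# paper's specification.

import torch
import torch.nn as nn

class RelationExtractor(nn.Module):
    """Maps an entity-pair representation to relation logits."""
    def __init__(self, input_dim: int, num_relations: int, hidden_dim: int = 256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_relations),
        )

    def forward(self, pair_repr: torch.Tensor) -> torch.Tensor:
        return self.mlp(pair_repr)

class DistantlyInformedExtractor(nn.Module):
    """Supervised extractor that also sees a frozen DS model's logits."""
    def __init__(self, ds_model: RelationExtractor, input_dim: int, num_relations: int):
        super().__init__()
        self.ds_model = ds_model
        for p in self.ds_model.parameters():  # keep the DS model fixed
            p.requires_grad = False
        # Concatenate the text features with the DS model's relation scores,
        # so distant supervision influences training only indirectly.
        self.classifier = RelationExtractor(input_dim + num_relations, num_relations)

    def forward(self, pair_repr: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            ds_logits = self.ds_model(pair_repr)
        return self.classifier(torch.cat([pair_repr, ds_logits], dim=-1))

# Usage: first train ds_model on distantly supervised data, then train the
# combined model on the manually annotated training set only.
# 14 classes = 13 DrugProt relation types plus a null class (assumed).
ds_model = RelationExtractor(input_dim=768, num_relations=14)
model = DistantlyInformedExtractor(ds_model, input_dim=768, num_relations=14)
logits = model(torch.randn(8, 768))  # a batch of 8 entity-pair representations
```

Freezing the DS model means its label noise cannot corrupt the gradients of the supervised classifier; the classifier is free to learn how much to trust the DS signal.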
Anthology ID:
2022.bionlp-1.16
Volume:
Proceedings of the 21st Workshop on Biomedical Language Processing
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
BioNLP
Publisher:
Association for Computational Linguistics
Pages:
161–170
URL:
https://aclanthology.org/2022.bionlp-1.16
DOI:
10.18653/v1/2022.bionlp-1.16
Cite (ACL):
Naoki Iinuma, Makoto Miwa, and Yutaka Sasaki. 2022. Improving Supervised Drug-Protein Relation Extraction with Distantly Supervised Models. In Proceedings of the 21st Workshop on Biomedical Language Processing, pages 161–170, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Improving Supervised Drug-Protein Relation Extraction with Distantly Supervised Models (Iinuma et al., BioNLP 2022)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2022.bionlp-1.16.pdf
Video:
https://preview.aclanthology.org/auto-file-uploads/2022.bionlp-1.16.mp4