ReactIE: Enhancing Chemical Reaction Extraction with Weak Supervision

Ming Zhong, Siru Ouyang, Minhao Jiang, Vivian Hu, Yizhu Jiao, Xuan Wang, Jiawei Han


Abstract
Structured chemical reaction information plays a vital role for chemists engaged in laboratory work and advanced endeavors such as computer-aided drug design. Despite the importance of extracting structured reactions from scientific literature, data annotation for this purpose is cost-prohibitive due to the significant labor required from domain experts. Consequently, the scarcity of training data hinders the progress of related models in this domain. In this paper, we propose ReactIE, which combines two weakly supervised approaches for pre-training. Our method utilizes frequent patterns within the text as linguistic cues to identify specific characteristics of chemical reactions. Additionally, we adopt synthetic data from patent records as distant supervision to incorporate domain knowledge into the model. Experiments demonstrate that ReactIE achieves substantial improvements and outperforms all existing baselines.
Anthology ID:
2023.findings-acl.767
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
12120–12130
URL:
https://aclanthology.org/2023.findings-acl.767
DOI:
10.18653/v1/2023.findings-acl.767
Cite (ACL):
Ming Zhong, Siru Ouyang, Minhao Jiang, Vivian Hu, Yizhu Jiao, Xuan Wang, and Jiawei Han. 2023. ReactIE: Enhancing Chemical Reaction Extraction with Weak Supervision. In Findings of the Association for Computational Linguistics: ACL 2023, pages 12120–12130, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
ReactIE: Enhancing Chemical Reaction Extraction with Weak Supervision (Zhong et al., Findings 2023)
PDF:
https://preview.aclanthology.org/nschneid-patch-5/2023.findings-acl.767.pdf