Detecting Chemical Reactions in Patents

Hiyori Yoshikawa, Dat Quoc Nguyen, Zenan Zhai, Christian Druckenbrodt, Camilo Thorne, Saber A. Akhondi, Timothy Baldwin, Karin Verspoor


Abstract
Extracting chemical reactions from patents is a crucial task for chemists working on chemical exploration. In this paper we introduce the novel task of detecting the textual spans that describe or refer to chemical reactions within patents. We formulate this task as a paragraph-level sequence tagging problem, where the system is required to return a sequence of paragraphs which contain a description of a reaction. To address this new task, we construct an annotated dataset from an existing proprietary database of chemical reactions manually extracted from patents. We introduce several baseline methods for the task and evaluate them over our dataset. Through error analysis, we discuss what makes the task complex and challenging, and suggest possible directions for future research.
Anthology ID:
U19-1014
Volume:
Proceedings of the 17th Annual Workshop of the Australasian Language Technology Association
Month:
4–6 December
Year:
2019
Address:
Sydney, Australia
Venue:
ALTA
Publisher:
Australasian Language Technology Association
Pages:
100–110
URL:
https://aclanthology.org/U19-1014
Cite (ACL):
Hiyori Yoshikawa, Dat Quoc Nguyen, Zenan Zhai, Christian Druckenbrodt, Camilo Thorne, Saber A. Akhondi, Timothy Baldwin, and Karin Verspoor. 2019. Detecting Chemical Reactions in Patents. In Proceedings of the 17th Annual Workshop of the Australasian Language Technology Association, pages 100–110, Sydney, Australia. Australasian Language Technology Association.
Cite (Informal):
Detecting Chemical Reactions in Patents (Yoshikawa et al., ALTA 2019)
PDF:
https://preview.aclanthology.org/remove-xml-comments/U19-1014.pdf