ZSEE: A Dataset based on Zeolite Synthesis Event Extraction for Automated Synthesis Platform
Song He, Xin Peng, Yihan Cai, Xin Li, Zhiqing Yuan, WenLi Du, Weimin Yang
Abstract
Automated synthesis of zeolite, one of the most important catalysts in chemical industries, holds great significance for attaining economic and environmental benefits. Structural synthesis data extracted through NLP technologies from zeolite experimental procedures can significantly expedite automated synthesis owing to its machine readability. However, the utilization of NLP technologies in information extraction of zeolite synthesis remains restricted due to the lack of annotated datasets. In this paper, we formulate an event extraction task to mine structural synthesis actions from experimental narratives for modular automated synthesis. Furthermore, we introduce ZSEE, a novel dataset containing fine-grained event annotations of zeolite synthesis actions. Our dataset features 16 event types and 13 argument roles which cover all the experimental operational steps of zeolite synthesis. We explore current state-of-the-art event extraction methods on ZSEE, perform error analysis based on the experimental results, and summarize the challenges and corresponding research directions to further facilitate the automated synthesis of zeolites. The code is publicly available at https://github.com/Hi-0317/ZSEE.
- Anthology ID: 2024.findings-naacl.116
- Volume: Findings of the Association for Computational Linguistics: NAACL 2024
- Month: June
- Year: 2024
- Address: Mexico City, Mexico
- Editors: Kevin Duh, Helena Gomez, Steven Bethard
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 1791–1808
- URL: https://preview.aclanthology.org/build-pipeline-with-new-library/2024.findings-naacl.116/
- DOI: 10.18653/v1/2024.findings-naacl.116
- Cite (ACL): Song He, Xin Peng, Yihan Cai, Xin Li, Zhiqing Yuan, WenLi Du, and Weimin Yang. 2024. ZSEE: A Dataset based on Zeolite Synthesis Event Extraction for Automated Synthesis Platform. In Findings of the Association for Computational Linguistics: NAACL 2024, pages 1791–1808, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal): ZSEE: A Dataset based on Zeolite Synthesis Event Extraction for Automated Synthesis Platform (He et al., Findings 2024)
- PDF: https://preview.aclanthology.org/build-pipeline-with-new-library/2024.findings-naacl.116.pdf