Automatically Labeled Data Generation for Large Scale Event Extraction

Yubo Chen, Shulin Liu, Xiang Zhang, Kang Liu, Jun Zhao


Abstract
Modern models of event extraction for tasks like ACE are based on supervised learning from small hand-labeled data. However, hand-labeled training data is expensive to produce, covers few event types, and is limited in size, which makes it hard for supervised methods to extract events at the scale needed for knowledge base population. To solve the data labeling problem, we propose to automatically label training data for event extraction via world knowledge and linguistic knowledge: our approach detects key arguments and trigger words for each event type and employs them to label events in texts automatically. Experimental results show that the quality of our large-scale automatically labeled data is competitive with elaborately human-labeled data. Moreover, our automatically labeled data can be combined with human-labeled data to improve the performance of models learned from these data.
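To illustrate the labeling idea summarized in the abstract, the following is a minimal, hypothetical sketch: it assumes that world knowledge (e.g., a knowledge base of event instances) supplies key arguments per event type and that linguistic knowledge (e.g., a trigger lexicon such as one derived from FrameNet, listed under Data below) supplies candidate trigger words; a sentence is then labeled when it mentions a trigger and all key arguments of a known instance. The class names, the simple string matching, and the toy example are assumptions for illustration only, not the paper's actual pipeline.

```python
from dataclasses import dataclass, field

@dataclass
class EventType:
    # Hypothetical container: trigger words and key-argument sets for one event type.
    name: str
    triggers: set = field(default_factory=set)              # candidate trigger words
    key_argument_sets: list = field(default_factory=list)   # one argument set per known event instance

def label_sentence(sentence: str, event_types: list) -> list:
    """Return (event_type, trigger) pairs where the sentence contains a trigger
    word and every key argument of some known event instance (string matching
    used here purely for brevity)."""
    text = sentence.lower()
    labels = []
    for et in event_types:
        trigger = next((t for t in et.triggers if t.lower() in text), None)
        if trigger is None:
            continue
        for args in et.key_argument_sets:
            if all(arg.lower() in text for arg in args):
                labels.append((et.name, trigger))
                break
    return labels

# Toy usage with an invented "marriage" event type and one known instance.
marriage = EventType(
    name="people.marriage",
    triggers={"married", "wed"},
    key_argument_sets=[{"barack obama", "michelle obama"}],
)
print(label_sentence("Barack Obama married Michelle Obama in 1992.", [marriage]))
# -> [('people.marriage', 'married')]
```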
Anthology ID:
P17-1038
Volume:
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2017
Address:
Vancouver, Canada
Editors:
Regina Barzilay, Min-Yen Kan
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
409–419
URL:
https://preview.aclanthology.org/build-pipeline-with-new-library/P17-1038/
DOI:
10.18653/v1/P17-1038
Cite (ACL):
Yubo Chen, Shulin Liu, Xiang Zhang, Kang Liu, and Jun Zhao. 2017. Automatically Labeled Data Generation for Large Scale Event Extraction. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 409–419, Vancouver, Canada. Association for Computational Linguistics.
Cite (Informal):
Automatically Labeled Data Generation for Large Scale Event Extraction (Chen et al., ACL 2017)
PDF:
https://preview.aclanthology.org/build-pipeline-with-new-library/P17-1038.pdf
Data
FrameNet