Semi-Supervised Event Extraction with Paraphrase Clusters
James Ferguson, Colin Lockard, Daniel Weld, Hannaneh Hajishirzi
Abstract
Supervised event extraction systems are limited in their accuracy due to the lack of available training data. We present a method for self-training event extraction systems by bootstrapping additional training data. This is done by taking advantage of the occurrence of multiple mentions of the same event instances across newswire articles from multiple sources. If our system can make a high-confidence extraction of some mentions in such a cluster, it can then acquire diverse training examples by adding the other mentions as well. Our experiments show significant performance improvements on multiple event extractors over ACE 2005 and TAC-KBP 2015 datasets.
- Anthology ID:
- N18-2058
- Volume:
- Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)
- Month:
- June
- Year:
- 2018
- Address:
- New Orleans, Louisiana
- Editors:
- Marilyn Walker, Heng Ji, Amanda Stent
- Venue:
- NAACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 359–364
- URL:
- https://aclanthology.org/N18-2058
- DOI:
- 10.18653/v1/N18-2058
- Cite (ACL):
- James Ferguson, Colin Lockard, Daniel Weld, and Hannaneh Hajishirzi. 2018. Semi-Supervised Event Extraction with Paraphrase Clusters. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pages 359–364, New Orleans, Louisiana. Association for Computational Linguistics.
- Cite (Informal):
- Semi-Supervised Event Extraction with Paraphrase Clusters (Ferguson et al., NAACL 2018)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/N18-2058.pdf