Keep Your Bearings: Lightly-Supervised Information Extraction with Ladder Networks That Avoids Semantic Drift

Ajay Nagesh, Mihai Surdeanu


Abstract
We propose a novel approach to semi-supervised learning for information extraction that uses ladder networks (Rasmus et al., 2015). In particular, we focus on the task of named entity classification, defined as identifying the correct label (e.g., person or organization name) of an entity mention in a given context. Our approach is simple and efficient, and it has the benefit of being robust to semantic drift, a dominant problem in most semi-supervised learning systems. We empirically demonstrate the superior performance of our system compared to the state-of-the-art on two standard datasets for named entity classification. We obtain between 62% and 200% improvement over the state-of-the-art baseline on these two datasets.
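For intuition, here is a minimal, simplified sketch of a ladder-network-style objective of the kind the abstract describes; it is not the authors' implementation. A noise-corrupted encoder pass drives the supervised classification loss, a clean pass supplies reconstruction targets, and a decoder with lateral connections pays a per-layer denoising cost on both labeled and unlabeled mentions. The PyTorch framing, dense mention features, layer sizes, noise level, and linear combinators are all illustrative assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LadderSketch(nn.Module):
        """Simplified ladder-network-style classifier for entity mentions."""

        def __init__(self, in_dim=300, hidden=128, n_classes=4, noise_std=0.3):
            super().__init__()
            self.noise_std = noise_std
            # Encoder, shared by the clean and the noise-corrupted passes.
            self.enc1 = nn.Linear(in_dim, hidden)
            self.enc2 = nn.Linear(hidden, n_classes)
            # Top-down decoder plus linear combinators that merge the top-down
            # signal with the corrupted lateral activation at each layer.
            self.dec2 = nn.Linear(n_classes, hidden)
            self.dec1 = nn.Linear(hidden, in_dim)
            self.comb1 = nn.Linear(2 * hidden, hidden)
            self.comb0 = nn.Linear(2 * in_dim, in_dim)

        def _corrupt(self, h, noisy):
            return h + self.noise_std * torch.randn_like(h) if noisy else h

        def _encode(self, x, noisy):
            z0 = self._corrupt(x, noisy)
            z1 = self._corrupt(torch.relu(self.enc1(z0)), noisy)
            logits = self._corrupt(self.enc2(z1), noisy)
            return z0, z1, logits

        def forward(self, x, y=None):
            # Corrupted pass: its predictions feed the supervised loss.
            z0_n, z1_n, logits_n = self._encode(x, noisy=True)
            # Clean pass: its activations serve as denoising targets.
            with torch.no_grad():
                z0, z1, _ = self._encode(x, noisy=False)
            # Decoder: reconstruct each layer from the layer above plus the
            # corrupted lateral connection.
            u1 = torch.relu(self.dec2(logits_n))
            r1 = self.comb1(torch.cat([z1_n, u1], dim=-1))
            u0 = self.dec1(r1)
            r0 = self.comb0(torch.cat([z0_n, u0], dim=-1))
            # Denoising cost applies to labeled and unlabeled mentions alike.
            loss = F.mse_loss(r1, z1) + F.mse_loss(r0, z0)
            if y is not None:  # supervised term, only for labeled mentions
                loss = loss + F.cross_entropy(logits_n, y)
            return logits_n, loss

    # Labeled mentions contribute both terms; unlabeled mentions only denoise.
    model = LadderSketch()
    x_lab, y_lab = torch.randn(8, 300), torch.randint(0, 4, (8,))
    x_unlab = torch.randn(32, 300)
    _, loss_lab = model(x_lab, y_lab)
    _, loss_unlab = model(x_unlab)
    (loss_lab + loss_unlab).backward()

Under these assumptions, labeled batches contribute both the cross-entropy and denoising terms, while unlabeled batches contribute only the denoising term, which is one way unannotated text can enter training without iterative bootstrapping.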
Anthology ID: N18-2057
Volume: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)
Month: June
Year: 2018
Address: New Orleans, Louisiana
Editors: Marilyn Walker, Heng Ji, Amanda Stent
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 352–358
URL: https://aclanthology.org/N18-2057
DOI: 10.18653/v1/N18-2057
Cite (ACL): Ajay Nagesh and Mihai Surdeanu. 2018. Keep Your Bearings: Lightly-Supervised Information Extraction with Ladder Networks That Avoids Semantic Drift. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pages 352–358, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal): Keep Your Bearings: Lightly-Supervised Information Extraction with Ladder Networks That Avoids Semantic Drift (Nagesh & Surdeanu, NAACL 2018)
PDF: https://preview.aclanthology.org/nschneid-patch-4/N18-2057.pdf
Data: OntoNotes 5.0