Syntactically Aware Neural Architectures for Definition Extraction

Luis Espinosa-Anke, Steven Schockaert


Abstract
Automatically identifying definitional knowledge in text corpora (Definition Extraction, or DE) is an important task with direct applications in, among others, Automatic Glossary Generation, Taxonomy Learning, Question Answering and Semantic Search. It is generally cast as a binary classification problem between definitional and non-definitional sentences. In this paper we present a set of neural architectures combining Convolutional and Recurrent Neural Networks, which are further enriched by incorporating linguistic information via syntactic dependencies. Our experimental results in the task of sentence classification, on two benchmarking DE datasets (one generic, one domain-specific), show that these models obtain consistent state-of-the-art results. Furthermore, we demonstrate that models trained on clean Wikipedia-like definitions can successfully be applied to noisier domain-specific corpora.
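To make the data flow described in the abstract concrete, the following is a minimal, untrained sketch and not the authors' implementation. It assumes that each token is represented by a word vector concatenated with a vector for its dependency label, that a convolution runs over windows of adjacent tokens, and that a simple Elman-style recurrent layer (a stand-in for whatever recurrent unit the paper uses) reads the convolved sequence before a sigmoid produces a definitional/non-definitional score. All names, dimensions, and the example sentence are hypothetical.

```python
import math
import random

random.seed(0)  # fixed seed so the toy weights are reproducible

# Hypothetical toy dimensions; real models would use far larger values.
WORD_DIM, DEP_DIM, N_FILTERS, HIDDEN, WINDOW = 4, 2, 3, 3, 2


def rand_matrix(rows, cols):
    """Small random weights; a trained model would learn these."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def embed(table, key, dim):
    """Look up a vector, creating a random one on first use (assumption:
    stands in for pretrained word or dependency-label embeddings)."""
    if key not in table:
        table[key] = [random.uniform(-0.1, 0.1) for _ in range(dim)]
    return table[key]


word_vecs, dep_vecs = {}, {}
conv_w = rand_matrix(N_FILTERS, WINDOW * (WORD_DIM + DEP_DIM))
rnn_in = rand_matrix(HIDDEN, N_FILTERS)
rnn_rec = rand_matrix(HIDDEN, HIDDEN)
out_w = [random.uniform(-0.1, 0.1) for _ in range(HIDDEN)]


def definition_probability(tokens):
    """tokens: list of (word, dependency_label) pairs for one sentence."""
    # 1. Token representation: word vector + dependency-label vector.
    inputs = [embed(word_vecs, w, WORD_DIM) + embed(dep_vecs, d, DEP_DIM)
              for w, d in tokens]
    # 2. Convolution over windows of WINDOW adjacent tokens, then ReLU.
    features = []
    for i in range(len(inputs) - WINDOW + 1):
        window = [x for vec in inputs[i:i + WINDOW] for x in vec]
        features.append([max(0.0, v) for v in matvec(conv_w, window)])
    # 3. Recurrence: simple Elman RNN over the convolved sequence.
    hidden = [0.0] * HIDDEN
    for feat in features:
        from_input = matvec(rnn_in, feat)
        from_state = matvec(rnn_rec, hidden)
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_state)]
    # 4. Sigmoid over the final state: binary definitional score.
    score = sum(w * h for w, h in zip(out_w, hidden))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical dependency-annotated sentence (labels chosen by hand).
sentence = [("A", "det"), ("glossary", "nsubj"), ("is", "cop"), ("a", "det"),
            ("list", "root"), ("of", "case"), ("terms", "nmod")]
prob = definition_probability(sentence)
```

Because the weights are random, `prob` carries no meaning here; the sketch only shows how syntactic labels could enter the input layer and how the convolutional and recurrent stages could be chained for sentence-level binary classification.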
Anthology ID:
N18-2061
Volume:
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
378–385
URL:
https://aclanthology.org/N18-2061
DOI:
10.18653/v1/N18-2061
Cite (ACL):
Luis Espinosa-Anke and Steven Schockaert. 2018. Syntactically Aware Neural Architectures for Definition Extraction. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers), pages 378–385, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal):
Syntactically Aware Neural Architectures for Definition Extraction (Espinosa-Anke & Schockaert, NAACL 2018)
PDF:
https://preview.aclanthology.org/auto-file-uploads/N18-2061.pdf