DEFT: A corpus for definition extraction in free- and semi-structured text

Sasha Spala; Nicholas A. Miller; Yiming Yang; Franck Dernoncourt; Carl Dockhorn

doi:10.18653/v1/W19-4015

Abstract

Definition extraction has been a popular topic in NLP research for well over a decade, but it has historically been limited to well-defined, structured, and narrow conditions. In reality, natural language is messy, and messy data requires both complex solutions and data that reflects that reality. In this paper, we present a robust English corpus and annotation schema that allow us to explore the less straightforward examples of term-definition structures in free and semi-structured text.

Anthology ID: W19-4015
Volume: Proceedings of the 13th Linguistic Annotation Workshop
Month: August
Year: 2019
Address: Florence, Italy
Editors: Annemarie Friedrich, Deniz Zeyrek, Jet Hoek
Venue: LAW
SIG: SIGANN
Publisher: Association for Computational Linguistics
Pages: 124–131
URL: https://aclanthology.org/W19-4015
DOI: 10.18653/v1/W19-4015
Cite (ACL): Sasha Spala, Nicholas A. Miller, Yiming Yang, Franck Dernoncourt, and Carl Dockhorn. 2019. DEFT: A corpus for definition extraction in free- and semi-structured text. In Proceedings of the 13th Linguistic Annotation Workshop, pages 124–131, Florence, Italy. Association for Computational Linguistics.
Cite (Informal): DEFT: A corpus for definition extraction in free- and semi-structured text (Spala et al., LAW 2019)
PDF: https://preview.aclanthology.org/nschneid-patch-4/W19-4015.pdf
Data: DEFT Corpus