DEFT: A corpus for definition extraction in free- and semi-structured text
Sasha Spala, Nicholas A. Miller, Yiming Yang, Franck Dernoncourt, Carl Dockhorn
Abstract
Definition extraction has been a popular topic in NLP research for well more than a decade, but has been historically limited to well-defined, structured, and narrow conditions. In reality, natural language is messy, and messy data requires both complex solutions and data that reflects that reality. In this paper, we present a robust English corpus and annotation schema that allows us to explore the less straightforward examples of term-definition structures in free and semi-structured text.
- Anthology ID:
- W19-4015
- Volume:
- Proceedings of the 13th Linguistic Annotation Workshop
- Month:
- August
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Annemarie Friedrich, Deniz Zeyrek, Jet Hoek
- Venue:
- LAW
- SIG:
- SIGANN
- Publisher:
- Association for Computational Linguistics
- Pages:
- 124–131
- URL:
- https://aclanthology.org/W19-4015
- DOI:
- 10.18653/v1/W19-4015
- Cite (ACL):
- Sasha Spala, Nicholas A. Miller, Yiming Yang, Franck Dernoncourt, and Carl Dockhorn. 2019. DEFT: A corpus for definition extraction in free- and semi-structured text. In Proceedings of the 13th Linguistic Annotation Workshop, pages 124–131, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- DEFT: A corpus for definition extraction in free- and semi-structured text (Spala et al., LAW 2019)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/W19-4015.pdf
- Data
- DEFT Corpus