The Content Types Dataset: a New Resource to Explore Semantic and Functional Characteristics of Texts
Rachele Sprugnoli, Tommaso Caselli, Sara Tonelli, Giovanni Moretti
Abstract
This paper presents a new resource, called Content Types Dataset, to promote the analysis of texts as a composition of units with specific semantic and functional roles. By developing this dataset, we also introduce a new NLP task for the automatic classification of Content Types. The annotation scheme and the dataset are described together with two sets of classification experiments.
- Anthology ID:
- E17-2042
- Volume:
- Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers
- Month:
- April
- Year:
- 2017
- Address:
- Valencia, Spain
- Venue:
- EACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 260–266
- URL:
- https://aclanthology.org/E17-2042
- Cite (ACL):
- Rachele Sprugnoli, Tommaso Caselli, Sara Tonelli, and Giovanni Moretti. 2017. The Content Types Dataset: a New Resource to Explore Semantic and Functional Characteristics of Texts. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 260–266, Valencia, Spain. Association for Computational Linguistics.
- Cite (Informal):
- The Content Types Dataset: a New Resource to Explore Semantic and Functional Characteristics of Texts (Sprugnoli et al., EACL 2017)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/E17-2042.pdf