CINTIL DependencyBank PREMIUM - A Corpus of Grammatical Dependencies for Portuguese

Rita de Carvalho, Andreia Querido, Marisa Campos, Rita Valadas Pereira, João Silva, António Branco


Abstract
This paper presents a new linguistic resource for the study and computational processing of Portuguese. CINTIL DependencyBank PREMIUM is a corpus of Portuguese news text, manually and accurately annotated with a wide range of linguistic information (morpho-syntax, named entities, syntactic functions and semantic roles), making it an invaluable resource, especially for the development and evaluation of data-driven natural language processing tools. The corpus is under active development and reaches 4,000 sentences in its current version. The paper also reports on the training and evaluation of a dependency parser over this corpus. CINTIL DependencyBank PREMIUM is freely available for research purposes through META-SHARE.
Anthology ID:
L16-1246
Volume:
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Month:
May
Year:
2016
Address:
Portorož, Slovenia
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
1552–1557
URL:
https://aclanthology.org/L16-1246
Cite (ACL):
Rita de Carvalho, Andreia Querido, Marisa Campos, Rita Valadas Pereira, João Silva, and António Branco. 2016. CINTIL DependencyBank PREMIUM - A Corpus of Grammatical Dependencies for Portuguese. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 1552–1557, Portorož, Slovenia. European Language Resources Association (ELRA).
Cite (Informal):
CINTIL DependencyBank PREMIUM - A Corpus of Grammatical Dependencies for Portuguese (de Carvalho et al., LREC 2016)
PDF:
https://preview.aclanthology.org/ingestion-script-update/L16-1246.pdf