Coreference in Prague Czech-English Dependency Treebank
Anna Nedoluzhko, Michal Novák, Silvie Cinková, Marie Mikulová, Jiří Mírovský
Abstract
We present coreference annotation on parallel Czech-English texts of the Prague Czech-English Dependency Treebank (PCEDT). The paper describes innovations made to PCEDT 2.0 concerning coreference, as well as coreference information already present there. We characterize the coreference annotation scheme, give statistics, and compare our annotation with the coreference annotation in OntoNotes and in the Prague Dependency Treebank for Czech. We also present experiments that use this corpus to improve the alignment of coreferential expressions, which helps us collect better statistics on correspondences between types of coreferential relations in Czech and English. The corpus, released as PCEDT 2.0 Coref, is publicly available.
- Anthology ID:
- L16-1026
- Volume:
- Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
- Month:
- May
- Year:
- 2016
- Address:
- Portorož, Slovenia
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 169–176
- URL:
- https://aclanthology.org/L16-1026
- Cite (ACL):
- Anna Nedoluzhko, Michal Novák, Silvie Cinková, Marie Mikulová, and Jiří Mírovský. 2016. Coreference in Prague Czech-English Dependency Treebank. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 169–176, Portorož, Slovenia. European Language Resources Association (ELRA).
- Cite (Informal):
- Coreference in Prague Czech-English Dependency Treebank (Nedoluzhko et al., LREC 2016)
- PDF:
- https://preview.aclanthology.org/paclic-22-ingestion/L16-1026.pdf
- Code
- ufal/pcedt2.0-coref