Annotating Characters in Literary Corpora: A Scheme, the CHARLES Tool, and an Annotated Novel
Hardik Vala, Stefan Dimitrov, David Jurgens, Andrew Piper, Derek Ruths
Abstract
Characters form the focus of various studies of literary works, including social network analysis, archetype induction, and plot comparison. The recent rise in the computational modelling of literary works has produced a proportional rise in the demand for character-annotated literary corpora. However, automatically identifying characters is an open problem, and few literary texts with manually labelled characters are available. To address the latter problem, this work presents three contributions: (1) a comprehensive scheme for manually resolving mentions to characters in texts; (2) a novel collaborative annotation tool, CHARLES (CHAracter Resolution Label-Entry System), for character annotation and similar cross-document tagging tasks; and (3) the character annotations resulting from a pilot study on the novel Pride and Prejudice, demonstrating that the scheme and tool facilitate the efficient production of high-quality annotations. We expect this work to motivate the further production of annotated literary corpora to help meet the demand of the community.
- Anthology ID:
- L16-1028
- Volume:
- Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
- Month:
- May
- Year:
- 2016
- Address:
- Portorož, Slovenia
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 184–189
- URL:
- https://aclanthology.org/L16-1028
- Cite (ACL):
- Hardik Vala, Stefan Dimitrov, David Jurgens, Andrew Piper, and Derek Ruths. 2016. Annotating Characters in Literary Corpora: A Scheme, the CHARLES Tool, and an Annotated Novel. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 184–189, Portorož, Slovenia. European Language Resources Association (ELRA).
- Cite (Informal):
- Annotating Characters in Literary Corpora: A Scheme, the CHARLES Tool, and an Annotated Novel (Vala et al., LREC 2016)
- PDF:
- https://preview.aclanthology.org/dois-2013-emnlp/L16-1028.pdf