Abstract
We propose a scheme for annotating direct speech in literary texts, based on the Text Encoding Initiative (TEI) and the coreference annotation guidelines from the Message Understanding Conference (MUC). The scheme encodes the speakers and listeners of utterances in a text, as well as the quotative verbs that report the utterances. We measure inter-annotator agreement on this annotation task. We then present statistics on a manually annotated corpus that consists of books from the New Testament. Finally, we visualize the corpus as a conversational network.
- Anthology ID:
- L16-1168
- Volume:
- Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
- Month:
- May
- Year:
- 2016
- Address:
- Portorož, Slovenia
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Sara Goggi, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 1059–1063
- URL:
- https://aclanthology.org/L16-1168
- Cite (ACL):
- John Lee and Chak Yan Yeung. 2016. An Annotated Corpus of Direct Speech. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pages 1059–1063, Portorož, Slovenia. European Language Resources Association (ELRA).
- Cite (Informal):
- An Annotated Corpus of Direct Speech (Lee & Yeung, LREC 2016)
- PDF:
- https://preview.aclanthology.org/fix-dup-bibkey/L16-1168.pdf