Nadja Schauffler


2014

The Extended DIRNDL Corpus as a Resource for Coreference and Bridging Resolution
Anders Björkelund | Kerstin Eckart | Arndt Riester | Nadja Schauffler | Katrin Schweitzer
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

DIRNDL is a spoken and written corpus based on German radio news, which features coreference and information-status annotation (including bridging anaphora and their antecedents), as well as prosodic information. We have recently extended DIRNDL with a fine-grained two-dimensional information-status labeling scheme. We have also applied a state-of-the-art part-of-speech and morphology tagger to the corpus, as well as highly accurate constituency and dependency parsers. In light of these developments, we believe that DIRNDL is an interesting resource for NLP researchers working on automatic coreference and bridging resolution. To enable and promote use of the data, we make it available for download in an accessible tabular format that is compatible with the formats used in the CoNLL and SemEval shared tasks on automatic coreference resolution.