Abstract
Several corpora annotated for coreference have been made available in the past decade. These resources differ in size and in underlying structure: the number of domains and their similarity. Our study compares domain-specific models, learned from small heterogeneous subsets of the investigated corpora, against uniform models that utilize all the available data. We show that for knowledge-poor baseline systems, domain-specific and uniform modeling yield the same results. Systems relying on large amounts of linguistic knowledge, however, exhibit differences in performance: with all the designed features in use, domain-specific models suffer from over-fitting, whereas with pre-selected feature sets they tend to outperform union models.
- Anthology ID:
- L12-1562
- Volume:
- Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
- Month:
- May
- Year:
- 2012
- Address:
- Istanbul, Turkey
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 187–191
- URL:
- http://www.lrec-conf.org/proceedings/lrec2012/pdf/944_Paper.pdf
- Cite (ACL):
- Olga Uryupina and Massimo Poesio. 2012. Domain-specific vs. Uniform Modeling for Coreference Resolution. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 187–191, Istanbul, Turkey. European Language Resources Association (ELRA).
- Cite (Informal):
- Domain-specific vs. Uniform Modeling for Coreference Resolution (Uryupina & Poesio, LREC 2012)