@inproceedings{lewin-etal-2012-centroids,
    title = "{C}entroids: Gold standards with distributional variation",
    author = "Lewin, Ian  and
      Kafkas, {\c{S}}enay  and
      Rebholz-Schuhmann, Dietrich",
    editor = "Calzolari, Nicoletta  and
      Choukri, Khalid  and
      Declerck, Thierry  and
      Do{\u{g}}an, Mehmet U{\u{g}}ur  and
      Maegaard, Bente  and
      Mariani, Joseph  and
      Moreno, Asuncion  and
      Odijk, Jan  and
      Piperidis, Stelios",
    booktitle = "Proceedings of the Eighth International Conference on Language Resources and Evaluation ({LREC}'12)",
    month = may,
    year = "2012",
    address = "Istanbul, Turkey",
    publisher = "European Language Resources Association (ELRA)",
    url = "https://preview.aclanthology.org/ingest-emnlp/L12-1364/",
    pages = "3894--3900",
    abstract = "Motivation: Gold Standards for named entities are, ironically, not standard themselves. Some specify the one perfect annotation. Others specify perfectly good alternatives. The concept of Silver Standard is relatively new. The objective is consensus rather than perfection. How should the two concepts be best represented and related? Approach: We examine several Biomedical Gold Standards and motivate a new representational format, centroids, which simply and effectively represents name distributions. We define an algorithm for finding centroids, given a set of alternative input annotations, and we test the outputs quantitatively and qualitatively. We also define a metric of relative acceptability on top of the centroid standard. Results: Precision, recall and F-scores of over 0.99 are achieved for the simple sanity check of giving the algorithm Gold Standard inputs. Qualitative analysis of the differences very often reveals errors and incompleteness in the original Gold Standard. Given automatically generated annotations, the centroids effectively represent the range of those contributions and the quality of the centroid annotations is highly competitive with the best of the contributors. Conclusion: Centroids cleanly represent alternative name variations for Silver and Gold Standards. A centroid Silver Standard is derived just like a Gold Standard, only from imperfect inputs."
}