@inproceedings{aranberri-2024-analysis,
    title = "Analysis of the Annotations from a Crowd {MT} Evaluation Initiative: Case Study for the {S}panish-{B}asque Pair",
    author = "Aranberri, Nora",
    editor = "Scarton, Carolina  and
      Prescott, Charlotte  and
      Bayliss, Chris  and
      Oakley, Chris  and
      Wright, Joanna  and
      Wrigley, Stuart  and
      Song, Xingyi  and
      Gow-Smith, Edward  and
      Bawden, Rachel  and
      S{\'a}nchez-Cartagena, V{\'i}ctor M  and
      Cadwell, Patrick  and
      Lapshinova-Koltunski, Ekaterina  and
      Cabarr{\~a}o, Vera  and
      Chatzitheodorou, Konstantinos  and
      Nurminen, Mary  and
      Kanojia, Diptesh  and
      Moniz, Helena",
    booktitle = "Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1)",
    month = jun,
    year = "2024",
    address = "Sheffield, UK",
    publisher = "European Association for Machine Translation (EAMT)",
    url = "https://preview.aclanthology.org/ingest-emnlp/2024.eamt-1.44/",
    pages = "548--559",
    abstract = "With the advent and success of trainable automatic evaluation metrics, creating annotated machine translation evaluation data sets is increasingly relevant. However, for low-resource languages, gathering such data can be challenging and further insights into evaluation design for opportunistic scenarios are necessary. In this work we explore an evaluation initiative that targets the Spanish--Basque language pair to study the impact of design decisions and the reliability of volunteer contributions. To do that, we compare the work carried out by volunteers and a translation professional in terms of evaluation results and evaluator agreement and examine the control measures used to ensure reliability. Results show similar behaviour regarding general quality assessment but underscore the need for more informative working environments to make evaluation processes more reliable as well as the need for carefully crafted control cases."
}