@inproceedings{specia-etal-2010-dataset,
    title = "A Dataset for Assessing Machine Translation Evaluation Metrics",
    author = "Specia, Lucia  and
      Cancedda, Nicola  and
      Dymetman, Marc",
    editor = "Calzolari, Nicoletta  and
      Choukri, Khalid  and
      Maegaard, Bente  and
      Mariani, Joseph  and
      Odijk, Jan  and
      Piperidis, Stelios  and
      Rosner, Mike  and
      Tapias, Daniel",
    booktitle = "Proceedings of the Seventh International Conference on Language Resources and Evaluation ({LREC}'10)",
    month = may,
    year = "2010",
    address = "Valletta, Malta",
    publisher = "European Language Resources Association (ELRA)",
    url = "https://preview.aclanthology.org/ingest-emnlp/L10-1349/",
    abstract = "We describe a dataset containing 16,000 translations produced by four machine translation systems and manually annotated for quality by professional translators. This dataset can be used in a range of tasks assessing machine translation evaluation metrics, from basic correlation analysis to training and test of machine learning-based metrics. By providing a standard dataset for such tasks, we hope to encourage the development of better MT evaluation metrics."
}