@inproceedings{hayakawa-arase-2020-fine,
    title = "Fine-Grained Error Analysis on {E}nglish-to-{J}apanese Machine Translation in the Medical Domain",
    author = "Hayakawa, Takeshi  and
      Arase, Yuki",
    editor = "Martins, Andr{\'e}  and
      Moniz, Helena  and
      Fumega, Sara  and
      Martins, Bruno  and
      Batista, Fernando  and
      Coheur, Luisa  and
      Parra, Carla  and
      Trancoso, Isabel  and
      Turchi, Marco  and
      Bisazza, Arianna  and
      Moorkens, Joss  and
      Guerberof, Ana  and
      Nurminen, Mary  and
      Marg, Lena  and
      Forcada, Mikel L.",
    booktitle = "Proceedings of the 22nd Annual Conference of the European Association for Machine Translation",
    month = nov,
    year = "2020",
    address = "Lisboa, Portugal",
    publisher = "European Association for Machine Translation",
    url = "https://preview.aclanthology.org/ingest-emnlp/2020.eamt-1.17/",
    pages = "155--164",
    abstract = "We performed a detailed error analysis in domain-specific neural machine translation (NMT) for the English and Japanese language pair with fine-grained manual annotation. Despite its importance for advancing NMT technologies, research on the performance of domain-specific NMT and non-European languages has been limited. In this study, we designed an error typology based on the error types that were typically generated by NMT systems and might cause significant impact in technical translations: ``Addition,'' ``Omission,'' ``Mistranslation,'' ``Grammar,'' and ``Terminology.'' The error annotation was targeted to the medical domain and was performed by experienced professional translators specialized in medicine under careful quality control. The annotation detected 4,912 errors on 2,480 sentences, and the frequency and distribution of errors were analyzed. We found that the major errors in NMT were ``Mistranslation'' and ``Terminology'' rather than ``Addition'' and ``Omission,'' which have been reported as typical problems of NMT. Interestingly, more errors occurred in documents for professionals compared with those for the general public. The results of our annotation work will be published as a parallel corpus with error labels, which are expected to contribute to developing better NMT models, automatic evaluation metrics, and quality estimation models."
}