Understanding the Impact of UGC Specificities on Translation Quality

José Carlos Rosales Núñez; Djamé Seddah; Guillaume Wisniewski

doi:10.18653/v1/2021.wnut-1.22

Abstract

This work takes a critical look at the evaluation of automatic translation of user-generated content (UGC), whose well-known specificities raise many challenges for MT. Our analyses show that measuring average-case performance with a standard metric on a UGC test set falls far short of giving a reliable picture of UGC translation quality. We therefore introduce a new data set for evaluating UGC translation, in which UGC specificities have been manually annotated using a fine-grained typology. Using this data set, we conduct several experiments that measure the impact of different kinds of UGC specificities on translation quality more precisely than was previously possible.

Anthology ID: 2021.wnut-1.22
Volume: Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021)
Month: November
Year: 2021
Address: Online
Venue: WNUT
Publisher: Association for Computational Linguistics
Pages: 189–198
URL: https://aclanthology.org/2021.wnut-1.22
DOI: 10.18653/v1/2021.wnut-1.22
Cite (ACL): José Carlos Rosales Núñez, Djamé Seddah, and Guillaume Wisniewski. 2021. Understanding the Impact of UGC Specificities on Translation Quality. In Proceedings of the Seventh Workshop on Noisy User-generated Text (W-NUT 2021), pages 189–198, Online. Association for Computational Linguistics.
Cite (Informal): Understanding the Impact of UGC Specificities on Translation Quality (Rosales Núñez et al., WNUT 2021)
PDF: https://preview.aclanthology.org/nodalida-main-page/2021.wnut-1.22.pdf
Data: MTNT
