Are Ellipses Important for Machine Translation?

Payal Khullar


Abstract
This article describes an experiment that evaluates the impact of different types of ellipses discussed in theoretical linguistics on Neural Machine Translation (NMT), with English as the source language and Hindi and Telugu as the target languages. Manual evaluation shows that most of the errors made by Google NMT are located in the clause containing the ellipsis, that such errors are slightly more frequent in Telugu than in Hindi, and that translation adequacy improves when ellipses are reconstructed with their antecedents. These findings not only confirm the importance of ellipses and their resolution for MT, but also hint at a possible correlation between the translation of discourse devices like ellipses and the morphological incongruity of the source and target languages. We also observe that not all ellipses are translated poorly, and not all benefit from reconstruction, which argues for treating different types of ellipses differently in MT research.
Anthology ID:
2021.cl-4.30
Volume:
Computational Linguistics, Volume 47, Issue 4 - December 2021
Month:
December
Year:
2021
Address:
Cambridge, MA
Venue:
CL
Publisher:
MIT Press
Pages:
927–937
URL:
https://aclanthology.org/2021.cl-4.30
DOI:
10.1162/coli_a_00414
Cite (ACL):
Payal Khullar. 2021. Are Ellipses Important for Machine Translation? Computational Linguistics, 47(4):927–937.
Cite (Informal):
Are Ellipses Important for Machine Translation? (Khullar, CL 2021)
PDF:
https://preview.aclanthology.org/paclic-22-ingestion/2021.cl-4.30.pdf