Abstract
The quality of Neural Machine Translation (NMT) has been shown to significantly degrade when confronted with source-side noise. We present the first large-scale study of state-of-the-art English-to-German NMT on real grammatical noise, by evaluating on several Grammar Correction corpora. We present methods for evaluating NMT robustness without true references, and we use them for extensive analysis of the effects that different grammatical errors have on the NMT output. We also introduce a technique for visualizing the divergence distribution caused by a source-side error, which allows for additional insights.
- Anthology ID:
- W19-4822
- Volume:
- Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP
- Month:
- August
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Tal Linzen, Grzegorz Chrupała, Yonatan Belinkov, Dieuwke Hupkes
- Venue:
- BlackboxNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 213–223
- URL:
- https://aclanthology.org/W19-4822
- DOI:
- 10.18653/v1/W19-4822
- Cite (ACL):
- Antonios Anastasopoulos. 2019. An Analysis of Source-Side Grammatical Errors in NMT. In Proceedings of the 2019 ACL Workshop BlackboxNLP: Analyzing and Interpreting Neural Networks for NLP, pages 213–223, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- An Analysis of Source-Side Grammatical Errors in NMT (Anastasopoulos, BlackboxNLP 2019)
- PDF:
- https://preview.aclanthology.org/proper-vol2-ingestion/W19-4822.pdf
- Data:
- FCE, JFLEG