The Box is in the Pen: Evaluating Commonsense Reasoning in Neural Machine Translation

Jie He; Tao Wang; Deyi Xiong; Qun Liu

doi:10.18653/v1/2020.findings-emnlp.327

Abstract

Does neural machine translation yield translations that are congenial with common sense? In this paper, we present a test suite to evaluate the commonsense reasoning capability of neural machine translation. The test suite consists of three test sets, covering lexical ambiguity and contextless and contextual syntactic ambiguity, all of which require commonsense knowledge to resolve. We manually create 1,200 triples, each of which contains a source sentence and two contrastive translations, involving 7 different common sense types. Language models pretrained on large-scale corpora, such as BERT and GPT-2, achieve a commonsense reasoning accuracy below 72% on the target translations of this test suite. We conduct extensive experiments on the test suite to evaluate commonsense reasoning in neural machine translation and investigate factors that affect this capability. Our experiments and analyses demonstrate that neural machine translation performs poorly on commonsense reasoning for the three ambiguity types in terms of both reasoning accuracy (≤ 60.1%) and reasoning consistency (≤ 31%). We will release our test suite as a machine translation commonsense reasoning testbed to promote future work in this direction.

Anthology ID: 2020.findings-emnlp.327
Volume: Findings of the Association for Computational Linguistics: EMNLP 2020
Month: November
Year: 2020
Address: Online
Editors: Trevor Cohn, Yulan He, Yang Liu
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 3662–3672
URL: https://aclanthology.org/2020.findings-emnlp.327
DOI: 10.18653/v1/2020.findings-emnlp.327
Cite (ACL): Jie He, Tao Wang, Deyi Xiong, and Qun Liu. 2020. The Box is in the Pen: Evaluating Commonsense Reasoning in Neural Machine Translation. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 3662–3672, Online. Association for Computational Linguistics.
Cite (Informal): The Box is in the Pen: Evaluating Commonsense Reasoning in Neural Machine Translation (He et al., Findings 2020)
PDF: https://preview.aclanthology.org/add_acl24_videos/2020.findings-emnlp.327.pdf
Code: tjunlp-lab/commonmt
Data: CommonMT