CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning

Bill Yuchen Lin; Wangchunshu Zhou; Ming Shen; Pei Zhou; Chandra Bhagavatula; Yejin Choi; Xiang Ren

doi:10.18653/v1/2020.findings-emnlp.165

Abstract

Recently, large-scale pre-trained language models have demonstrated impressive performance on several commonsense-reasoning benchmark datasets. However, building machines with commonsense to compose realistically plausible sentences remains challenging. In this paper, we present a constrained text generation task, CommonGen, associated with a benchmark dataset, to explicitly test machines for the ability of generative commonsense reasoning. Given a set of common concepts (e.g., dog, frisbee, catch, throw), the task is to generate a coherent sentence describing an everyday scenario using these concepts (e.g., “a man throws a frisbee and his dog catches it”). The CommonGen task is challenging because it inherently requires 1) relational reasoning with background commonsense knowledge and 2) compositional generalization ability to work on unseen concept combinations. Our dataset, constructed through a combination of crowdsourced and existing caption corpora, consists of 77k commonsense descriptions over 35k unique concept-sets. Experiments show that there is a large gap between state-of-the-art text generation models (e.g., T5) and human performance (31.6% vs. 63.5% in SPICE metric). Furthermore, we demonstrate that the learned generative commonsense reasoning capability can be transferred to improve downstream tasks such as CommonsenseQA (76.9% to 78.4% in dev accuracy) by generating additional context.
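The constraint in the task (every concept in the input set should appear in the generated sentence) can be illustrated with a small check. The following is a minimal sketch, not the paper's evaluation code: the function name `concept_coverage` and the suffix list are hypothetical, and the naive suffix matching is only a stand-in for proper lemmatization.

```python
import re

# Assumption: a tiny list of inflectional suffixes stands in for real lemmatization.
SUFFIXES = ("", "s", "es", "ed", "ing")


def concept_coverage(concepts, sentence):
    """Return (fraction of concepts mentioned, list of missing concepts).

    A concept counts as mentioned if some token equals the concept
    followed by one of the suffixes in SUFFIXES.
    """
    if not concepts:
        raise ValueError("concept set must not be empty")
    # Lowercase the sentence and keep alphabetic tokens only.
    tokens = re.findall(r"[a-z]+", sentence.lower())
    missing = []
    for concept in concepts:
        mentioned = any(
            tok.startswith(concept) and tok[len(concept):] in SUFFIXES
            for tok in tokens
        )
        if not mentioned:
            missing.append(concept)
    covered = len(concepts) - len(missing)
    return covered / len(concepts), missing


concepts = ["dog", "frisbee", "catch", "throw"]
full = concept_coverage(concepts, "A man throws a frisbee and his dog catches it.")
partial = concept_coverage(concepts, "A dog runs in the park.")
print(full)
print(partial)
```

A check like this only verifies that the concepts are present. It says nothing about whether the sentence describes a plausible everyday scenario, which is the harder part of the task.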

Anthology ID: 2020.findings-emnlp.165
Volume: Findings of the Association for Computational Linguistics: EMNLP 2020
Month: November
Year: 2020
Address: Online
Venues: EMNLP | Findings
Publisher: Association for Computational Linguistics
Pages: 1823–1840
URL: https://aclanthology.org/2020.findings-emnlp.165
DOI: 10.18653/v1/2020.findings-emnlp.165
Cite (ACL): Bill Yuchen Lin, Wangchunshu Zhou, Ming Shen, Pei Zhou, Chandra Bhagavatula, Yejin Choi, and Xiang Ren. 2020. CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1823–1840, Online. Association for Computational Linguistics.
Cite (Informal): CommonGen: A Constrained Text Generation Challenge for Generative Commonsense Reasoning (Lin et al., Findings 2020)
PDF: https://preview.aclanthology.org/update-css-js/2020.findings-emnlp.165.pdf
Data: CommonGen, ConceptNet, HellaSwag, SWAG
