COGS: A Compositional Generalization Challenge Based on Semantic Interpretation

Najoung Kim; Tal Linzen

doi:10.18653/v1/2020.emnlp-main.731

Abstract

Natural language is characterized by compositionality: the meaning of a complex expression is constructed from the meanings of its constituent parts. To facilitate the evaluation of the compositional abilities of language processing architectures, we introduce COGS, a semantic parsing dataset based on a fragment of English. The evaluation portion of COGS contains multiple systematic gaps that can only be addressed by compositional generalization; these include new combinations of familiar syntactic structures, or new combinations of familiar words and familiar structures. In experiments with Transformers and LSTMs, we found that in-distribution accuracy on the COGS test set was near-perfect (96–99%), but generalization accuracy was substantially lower (16–35%) and showed high sensitivity to random seed (±6–8%). These findings indicate that contemporary standard NLP models are limited in their compositional generalization capacity, and position COGS as a good way to measure progress.
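The abstract reports accuracy on two evaluation sets and a spread across random seeds. As a rough illustration of how such numbers could be computed, the sketch below scores predictions by exact string match and then summarizes per-seed accuracies with a mean and a sample standard deviation. Both choices are assumptions made for illustration: the abstract does not state the metric or how the ± figure is derived. The function names and the toy strings are hypothetical and are not taken from the COGS data or code.

```python
from statistics import mean, stdev


def exact_match_accuracy(predictions, golds):
    """Fraction of predictions that equal the gold output string exactly.

    Assumption: outputs are compared as whitespace-stripped strings.
    """
    if len(predictions) != len(golds):
        raise ValueError("predictions and golds must have the same length")
    if not golds:
        raise ValueError("cannot score an empty evaluation set")
    hits = sum(p.strip() == g.strip() for p, g in zip(predictions, golds))
    return hits / len(golds)


def summarize_over_seeds(accuracies):
    """Return (mean, sample standard deviation) of per-seed accuracies."""
    spread = stdev(accuracies) if len(accuracies) > 1 else 0.0
    return mean(accuracies), spread


# Toy placeholder strings, NOT real COGS logical forms or model outputs.
golds = ["f ( a )", "g ( b )", "h ( a , b )", "f ( b )"]
predictions_by_seed = {
    0: ["f ( a )", "g ( b )", "h ( a , b )", "f ( a )"],
    1: ["f ( a )", "g ( a )", "h ( b , a )", "f ( a )"],
}

per_seed = [exact_match_accuracy(p, golds) for p in predictions_by_seed.values()]
mean_acc, spread = summarize_over_seeds(per_seed)
print(f"mean accuracy = {mean_acc:.2f}, spread across seeds = {spread:.2f}")
```

Running the same scoring function once on an in-distribution test set and once on a generalization set would yield the two kinds of figures the abstract contrasts.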

Anthology ID: 2020.emnlp-main.731
Volume: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Month: November
Year: 2020
Address: Online
Editors: Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 9087–9105
URL: https://preview.aclanthology.org/add-emnlp-2024-awards/2020.emnlp-main.731/
DOI: 10.18653/v1/2020.emnlp-main.731
Cite (ACL): Najoung Kim and Tal Linzen. 2020. COGS: A Compositional Generalization Challenge Based on Semantic Interpretation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 9087–9105, Online. Association for Computational Linguistics.
Cite (Informal): COGS: A Compositional Generalization Challenge Based on Semantic Interpretation (Kim & Linzen, EMNLP 2020)
PDF: https://preview.aclanthology.org/add-emnlp-2024-awards/2020.emnlp-main.731.pdf
Video: https://slideslive.com/38939064
Code: najoungkim/COGS
Data: CFQ, SCAN
