Grammar-based Decoding for Improved Compositional Generalization in Semantic Parsing

Jing Zheng; Jyh-Herng Chow; Zhongnan Shen; Peng Xu

doi:10.18653/v1/2023.findings-acl.91

Abstract

Sequence-to-sequence (seq2seq) models have achieved great success in semantic parsing tasks, but they tend to struggle on out-of-distribution (OOD) data. Despite recent progress, robust semantic parsing on large-scale tasks that combine the challenges of compositional generalization and natural language variation remains an unsolved problem. To promote research in this area, this work presents CUDON, a large-scale Chinese dialogue dataset designed specifically for evaluating compositional generalization in semantic parsing. The dataset contains about ten thousand multi-turn complex queries and provides multiple splits with different degrees of train-test distribution divergence. We investigate improving compositional generalization with grammar-based decoding on this dataset. With specially designed grammars that leverage the program schema, we substantially improve the accuracy of seq2seq semantic parsers on OOD splits: an LSTM-based parser using a context-free grammar (CFG) achieves over 25% higher accuracy than a standard seq2seq baseline, and a parser using a tree-substitution grammar (TSG) improves parsing speed by five to seven times over the CFG parser with only a small accuracy loss. The grammar-based LSTM parsers also outperform BART- and T5-based seq2seq parsers on the OOD splits, despite having less than one tenth of the parameters and no pretraining. We also verify our approach on the SMCalFlow-CS dataset, in particular on the zero-shot learning task.

Anthology ID: 2023.findings-acl.91
Volume: Findings of the Association for Computational Linguistics: ACL 2023
Month: July
Year: 2023
Address: Toronto, Canada
Editors: Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 1399–1418
URL: https://preview.aclanthology.org/fix-sig-urls/2023.findings-acl.91/
DOI: 10.18653/v1/2023.findings-acl.91
Cite (ACL): Jing Zheng, Jyh-Herng Chow, Zhongnan Shen, and Peng Xu. 2023. Grammar-based Decoding for Improved Compositional Generalization in Semantic Parsing. In Findings of the Association for Computational Linguistics: ACL 2023, pages 1399–1418, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal): Grammar-based Decoding for Improved Compositional Generalization in Semantic Parsing (Zheng et al., Findings 2023)
PDF: https://preview.aclanthology.org/fix-sig-urls/2023.findings-acl.91.pdf
