Grammar-based Decoding for Improved Compositional Generalization in Semantic Parsing

Jing Zheng, Jyh-Herng Chow, Zhongnan Shen, Peng Xu


Abstract
Sequence-to-sequence (seq2seq) models have achieved great success in semantic parsing tasks, but they tend to struggle on out-of-distribution (OOD) data. Despite recent progress, robust semantic parsing on large-scale tasks that combine the challenges of compositional generalization and natural language variation remains an unsolved problem. To promote research in this area, this work presents CUDON, a large-scale Chinese dialogue dataset designed specifically for evaluating the compositional generalization of semantic parsers. The dataset contains about ten thousand complex multi-turn queries and provides multiple splits with different degrees of train-test distribution divergence. We investigate improving compositional generalization with grammar-based decoding on this dataset. With specially designed grammars that leverage the program schema, we are able to substantially improve the accuracy of seq2seq semantic parsers on OOD splits: an LSTM-based parser using a Context-free Grammar (CFG) achieves over 25% higher accuracy than a standard seq2seq baseline; a parser using a Tree-Substitution Grammar (TSG) improves parsing speed five to seven times over the CFG parser with only a small accuracy loss. The grammar-based LSTM parsers also outperform BART- and T5-based seq2seq parsers on the OOD splits, despite having fewer than one tenth of the parameters and no pretraining. We also verified our approach on the SMCalFlow-CS dataset, in particular on its zero-shot learning task.
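To make the idea of grammar-based decoding concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy context-free grammar (`GRAMMAR`), a greedy leftmost derivation, and a stand-in scoring function (`prefer`) in place of a trained LSTM decoder; all of these names and the grammar itself are hypothetical. The point it illustrates is that the decoder only ever chooses among productions the grammar licenses for the current nonterminal, so every output is a well-formed program.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical toy grammar (not from the paper): nonterminal -> right-hand sides.
GRAMMAR: Dict[str, List[Tuple[str, ...]]] = {
    "Call": [("CreateEvent", "(", "Arg", ")"), ("FindEvent", "(", "Arg", ")")],
    "Arg": [("subject", "=", "Str"), ("start", "=", "Time")],
    "Str": [('"meeting"',), ('"lunch"',)],
    "Time": [("tomorrow",), ("noon",)],
}

Rule = Tuple[str, Tuple[str, ...]]
Scorer = Callable[[List[str], Rule], float]


def grammar_decode(start: str, score: Scorer, max_steps: int = 100) -> List[str]:
    """Greedy leftmost derivation; only grammar-licensed rules are ever scored."""
    stack = [start]            # symbols still to expand, top of stack = leftmost
    output: List[str] = []     # terminals emitted so far
    steps = 0
    while stack:
        steps += 1
        if steps > max_steps:
            raise RuntimeError("derivation did not finish within max_steps")
        symbol = stack.pop()
        if symbol not in GRAMMAR:
            output.append(symbol)  # terminal: emit it unchanged
            continue
        # Candidate set is restricted to productions of this nonterminal.
        candidates = [(symbol, rhs) for rhs in GRAMMAR[symbol]]
        _, rhs = max(candidates, key=lambda rule: score(output, rule))
        stack.extend(reversed(rhs))  # push so the leftmost symbol is popped next
    return output


def prefer(tokens: Set[str]) -> Scorer:
    """Stand-in for a neural scorer: counts preferred tokens in a rule's RHS."""
    def score(prefix: List[str], rule: Rule) -> float:
        return float(sum(tok in tokens for tok in rule[1]))
    return score


program = grammar_decode("Call", prefer({"FindEvent", "start", "noon"}))
print(" ".join(program))
```

In a real parser the scorer would be conditioned on the encoded utterance and the derivation history. A tree-substitution grammar would, under the same scheme, let a single decision expand a larger tree fragment, which reduces the number of decoding steps.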
Anthology ID:
2023.findings-acl.91
Volume:
Findings of the Association for Computational Linguistics: ACL 2023
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1399–1418
URL:
https://aclanthology.org/2023.findings-acl.91
DOI:
10.18653/v1/2023.findings-acl.91
Cite (ACL):
Jing Zheng, Jyh-Herng Chow, Zhongnan Shen, and Peng Xu. 2023. Grammar-based Decoding for Improved Compositional Generalization in Semantic Parsing. In Findings of the Association for Computational Linguistics: ACL 2023, pages 1399–1418, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Grammar-based Decoding for Improved Compositional Generalization in Semantic Parsing (Zheng et al., Findings 2023)
PDF:
https://preview.aclanthology.org/proper-vol2-ingestion/2023.findings-acl.91.pdf