Align and Augment: Generative Data Augmentation for Compositional Generalization

Francesco Cazzaro, Davide Locatelli, Ariadna Quattoni


Abstract
Recent work on semantic parsing has shown that seq2seq models find compositional generalization challenging. Several strategies have been proposed to mitigate this challenge. One such strategy is to improve compositional generalization via data augmentation techniques. In this paper we follow this line of work and propose Archer, a data-augmentation strategy that exploits alignment annotations between sentences and their corresponding meaning representations. More precisely, we use alignments to train a two-step generative model that combines monotonic lexical generation with reordering. Our experiments show that Archer leads to significant improvements in compositional generalization performance.
Anthology ID:
2024.eacl-long.22
Volume:
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
March
Year:
2024
Address:
St. Julian’s, Malta
Editors:
Yvette Graham, Matthew Purver
Venue:
EACL
Publisher:
Association for Computational Linguistics
Pages:
369–383
URL:
https://aclanthology.org/2024.eacl-long.22
Cite (ACL):
Francesco Cazzaro, Davide Locatelli, and Ariadna Quattoni. 2024. Align and Augment: Generative Data Augmentation for Compositional Generalization. In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), pages 369–383, St. Julian’s, Malta. Association for Computational Linguistics.
Cite (Informal):
Align and Augment: Generative Data Augmentation for Compositional Generalization (Cazzaro et al., EACL 2024)
PDF:
https://preview.aclanthology.org/dois-2013-emnlp/2024.eacl-long.22.pdf
Video:
https://preview.aclanthology.org/dois-2013-emnlp/2024.eacl-long.22.mp4