Can Sequence-to-Sequence Transformers Naturally Understand Sequential Instructions?

Xiang Zhou, Aditya Gupta, Shyam Upadhyay, Mohit Bansal, Manaal Faruqui


Abstract
While many real-life tasks require reasoning over multi-step sequential instructions, collecting fine-grained annotations for each intermediate step can be prohibitively expensive. In this work, we study how general pretrained sequence-to-sequence transformers perform under varying types of annotation for sequential instruction understanding. We conduct experiments using T5 (Raffel et al., 2020) on SCONE (Long et al., 2016), a commonly used multi-step instruction understanding dataset that includes three sub-tasks. First, we show that with gold supervision only for the final step of a multi-step instruction sequence, transformers may, depending on the sequential properties of the task, perform extremely poorly on intermediate steps, in stark contrast with their performance on the final step. Next, we explore two directions to alleviate this problem. We show that with the same limited annotation budget, distributing supervision uniformly across steps (instead of supervising only the final step) greatly improves performance on intermediate steps, at the cost of a drop in final-step performance. Further, we explore a contrastive learning approach that provides training signals on intermediate steps with zero intermediate gold supervision. This approach, however, achieves mixed results: it significantly improves the model’s poor intermediate-step performance on one sub-task but decreases performance on another.
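The contrast between final-step-only supervision and supervision spread uniformly across steps under a fixed annotation budget can be illustrated with a minimal sketch. This is not the authors' code: the function names, the round-robin allocation, and the example values (1000 sequences, 5 steps per sequence, a budget of 100 labels) are all assumptions chosen for illustration.

```python
from collections import Counter


def final_step_only(num_sequences, num_steps, budget):
    """Spend the whole budget on final-step labels, one per chosen sequence."""
    return [(seq, num_steps - 1) for seq in range(min(budget, num_sequences))]


def uniform_over_steps(num_sequences, num_steps, budget):
    """Spend the same budget, but rotate the labeled step across sequences.

    Round-robin over step indices is an assumed allocation scheme; it keeps
    the number of gold labels per step position balanced.
    """
    return [(seq, seq % num_steps) for seq in range(min(budget, num_sequences))]


# Hypothetical setting: 1000 instruction sequences, 5 steps each, 100 gold labels.
final_plan = final_step_only(1000, 5, 100)
uniform_plan = uniform_over_steps(1000, 5, 100)

# Both plans use the same number of annotations; only their placement differs.
print(Counter(step for _, step in final_plan))
print(Counter(step for _, step in uniform_plan))
```

Each plan is a list of (sequence index, step index) pairs naming which states receive gold annotation. Under these assumptions the two plans cost the same, but the second gives the model direct supervision at every step position.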
Anthology ID:
2023.starsem-1.45
Volume:
Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Alexis Palmer, Jose Camacho-Collados
Venue:
*SEM
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
527–534
URL:
https://aclanthology.org/2023.starsem-1.45
DOI:
10.18653/v1/2023.starsem-1.45
Cite (ACL):
Xiang Zhou, Aditya Gupta, Shyam Upadhyay, Mohit Bansal, and Manaal Faruqui. 2023. Can Sequence-to-Sequence Transformers Naturally Understand Sequential Instructions? In Proceedings of the 12th Joint Conference on Lexical and Computational Semantics (*SEM 2023), pages 527–534, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Can Sequence-to-Sequence Transformers Naturally Understand Sequential Instructions? (Zhou et al., *SEM 2023)
PDF:
https://preview.aclanthology.org/landing_page/2023.starsem-1.45.pdf