Abstract
When mapping a natural language instruction to a sequence of actions, it is often useful to identify sub-tasks in the instruction. Such sub-task segmentation, however, is not necessarily provided in the training data. We present the A2LCTC (Action-to-Language Connectionist Temporal Classification) algorithm to automatically discover a sub-task segmentation of an action sequence. A2LCTC does not require annotations of correct sub-task segments and learns to find them from pairs of instructions and action sequences in a weakly-supervised manner. We experiment with the ALFRED dataset and show that A2LCTC accurately finds the sub-task structures. With the discovered sub-task segments, we also train agents that work on the downstream task and empirically show that our algorithm improves performance.
- Anthology ID:
- 2022.lnls-1.1
- Volume:
- Proceedings of the First Workshop on Learning with Natural Language Supervision
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Venue:
- LNLS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1–9
- URL:
- https://aclanthology.org/2022.lnls-1.1
- DOI:
- 10.18653/v1/2022.lnls-1.1
- Cite (ACL):
- Ryokan Ri, Yufang Hou, Radu Marinescu, and Akihiro Kishimoto. 2022. Finding Sub-task Structure with Natural Language Instruction. In Proceedings of the First Workshop on Learning with Natural Language Supervision, pages 1–9, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Finding Sub-task Structure with Natural Language Instruction (Ri et al., LNLS 2022)
- PDF:
- https://preview.aclanthology.org/auto-file-uploads/2022.lnls-1.1.pdf
- Data
- ALFRED