Abstract
In the Minecraft Collaborative Building Task, two players collaborate: an Architect (A) provides instructions to a Builder (B) to assemble a specified structure using 3D blocks. In this work, we investigate the use of large language models (LLMs) to predict the sequence of actions taken by the Builder. Leveraging LLMs’ in-context learning abilities, we use few-shot prompting techniques that significantly improve performance over baseline methods. Additionally, we present a detailed analysis of the gaps in performance for future work.
- Anthology ID:
- 2024.findings-emnlp.652
- Volume:
- Findings of the Association for Computational Linguistics: EMNLP 2024
- Month:
- November
- Year:
- 2024
- Address:
- Miami, Florida, USA
- Editors:
- Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 11159–11170
- URL:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2024.findings-emnlp.652/
- DOI:
- 10.18653/v1/2024.findings-emnlp.652
- Cite (ACL):
- Kranti Ch, Sherzod Hakimov, and David Schlangen. 2024. Retrieval-Augmented Code Generation for Situated Action Generation: A Case Study on Minecraft. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 11159–11170, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal):
- Retrieval-Augmented Code Generation for Situated Action Generation: A Case Study on Minecraft (Ch et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/build-pipeline-with-new-library/2024.findings-emnlp.652.pdf