Abstract
The recently introduced path-star task is a minimal task designed to exemplify limitations to the abilities of language models (Bachmann and Nagarajan, 2024). It involves a path-star graph where multiple arms radiate from a single starting node and each node is unique. Given the start node and a specified target node that ends an arm, the task is to generate the arm containing that target node. This is straightforward for a human but surprisingly difficult for language models, which did not outperform the random baseline. The authors hypothesized this is due to a deficiency in teacher-forcing and the next-token prediction paradigm. We demonstrate the task is learnable using teacher-forcing in alternative settings and that the issue is partially due to representation. We introduce a regularization method using structured samples of the same graph but with differing target nodes, improving results across a variety of model types. We provide RASP proofs showing the task is theoretically solvable. Finally, we find settings where an encoder-only model can consistently solve the task.

- Anthology ID: 2024.emnlp-main.695
- Volume: Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
- Month: November
- Year: 2024
- Address: Miami, Florida, USA
- Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
- Venue: EMNLP
- Publisher: Association for Computational Linguistics
- Pages: 12493–12516
- URL: https://aclanthology.org/2024.emnlp-main.695
- DOI: 10.18653/v1/2024.emnlp-main.695
- Cite (ACL): Arvid Frydenlund. 2024. The Mystery of the Pathological Path-star Task for Language Models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 12493–12516, Miami, Florida, USA. Association for Computational Linguistics.
- Cite (Informal): The Mystery of the Pathological Path-star Task for Language Models (Frydenlund, EMNLP 2024)
- PDF: https://preview.aclanthology.org/dois-2013-emnlp/2024.emnlp-main.695.pdf
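
To make the task in the abstract concrete, the following is a minimal sketch of a path-star instance and of one simple procedure that recovers the requested arm. It is not the paper's data pipeline or any of its models. The function names `make_path_star` and `solve`, the parameters `num_arms`, `arm_len`, and `num_labels`, and the representation of the graph as a shuffled list of `(parent, child)` edges are all assumptions made for illustration.

```python
import random


def make_path_star(num_arms, arm_len, num_labels, rng):
    """Build a hypothetical path-star instance.

    Returns the start node, the list of arms (each a list of node labels,
    ordered outward from the start), and a shuffled list of directed edges.
    """
    # Every node gets a unique label, as the task description requires.
    labels = rng.sample(range(num_labels), 1 + num_arms * arm_len)
    start, rest = labels[0], labels[1:]
    arms = [rest[i * arm_len:(i + 1) * arm_len] for i in range(num_arms)]

    edges = []
    for arm in arms:
        path = [start] + arm
        edges.extend(zip(path, path[1:]))  # (parent, child) pairs along the arm
    rng.shuffle(edges)  # assumed: edges are presented in arbitrary order
    return start, arms, edges


def solve(edges, start, target):
    """Recover the arm from `start` to `target` by walking parent links backward."""
    # Each non-start node has exactly one parent because all nodes are unique.
    parent = {child: par for par, child in edges}
    path = [target]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1]  # reverse so the path runs from start to target


rng = random.Random(0)
start, arms, edges = make_path_star(num_arms=3, arm_len=4, num_labels=50, rng=rng)
target = arms[1][-1]  # the node that ends the second arm
assert solve(edges, start, target) == [start] + arms[1]
```

Walking backward from the target is unambiguous because every node has a single parent, whereas walking forward from the start requires choosing among `num_arms` outgoing edges at the first step. Under the assumption that the random baseline mentioned in the abstract is a uniform guess over arms, it would be correct with probability `1 / num_arms`.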