On the Learnability of Syntax from Raw Speech with Autoregressive Predictive Coding

Shunsuke Kando, Yusuke Miyao


Abstract
Children are known to generalize syntactic knowledge at ages when their linguistic input is predominantly raw speech rather than text. This raises the question of whether syntactic generalization can emerge directly from acoustic input. We address this question using Autoregressive Predictive Coding (APC), a simple prediction-based self-supervised speech model. To approximate the input available to human learners while enabling controlled comparison, we train models on both child-directed speech and audiobook speech. We evaluate the models on a minimal-pair benchmark targeting elementary syntactic phenomena, designed to be acquisition-friendly. Our results show that APC partially generalizes word-order regularities when trained to predict near-future frames. However, the model fails to generalize agreement phenomena, suggesting that predictive learning from acoustic signals alone is insufficient for acquiring agreement. Furthermore, we observe distinct learning dynamics across word-order phenomena, suggesting that some improvements may be driven by shallow statistical regularities rather than genuine syntactic generalization.
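As a rough illustration of the two ideas named in the abstract, the sketch below pairs an APC-style objective (mean L1 error between each prediction and the frame n steps ahead) with a minimal-pair comparison that counts the model as correct when the grammatical utterance receives the lower prediction error. Everything in it is an assumption made for illustration and is not taken from the paper: the names `apc_loss`, `minimal_pair_accuracy`, and `copy_predictor`, the one-dimensional toy frames, and the loss-based scoring rule, which may differ from the authors' actual evaluation procedure.

```python
from typing import Callable, List, Sequence, Tuple

Frame = Sequence[float]        # one acoustic feature vector (hypothetical toy format)
Utterance = List[Frame]        # a sequence of frames
Predictor = Callable[[Utterance], Utterance]


def apc_loss(predict: Predictor, frames: Utterance, n: int = 1) -> float:
    """Mean L1 error between predictions and the frames n steps ahead."""
    if len(frames) <= n:
        raise ValueError("utterance must be longer than the prediction shift n")
    preds = predict(frames[:-n])   # one prediction per input frame
    targets = frames[n:]           # the target for input frame t is frame t + n
    total, count = 0.0, 0
    for pred, target in zip(preds, targets):
        for a, b in zip(pred, target):
            total += abs(a - b)
            count += 1
    return total / count


def minimal_pair_accuracy(
    predict: Predictor,
    pairs: List[Tuple[Utterance, Utterance]],
    n: int = 1,
) -> float:
    """Fraction of (grammatical, ungrammatical) pairs where the grammatical
    utterance gets the strictly lower prediction error (assumed scoring rule)."""
    correct = sum(
        apc_loss(predict, good, n) < apc_loss(predict, bad, n)
        for good, bad in pairs
    )
    return correct / len(pairs)


def copy_predictor(frames: Utterance) -> Utterance:
    """Toy stand-in for a trained model: predicts that the future frame
    equals the current frame."""
    return [list(f) for f in frames]


# Toy "utterances": the smooth one is easier to predict than the jumpy one.
smooth = [[0.0], [0.1], [0.2], [0.3]]
jumpy = [[0.0], [1.0], [0.0], [1.0]]
accuracy = minimal_pair_accuracy(copy_predictor, [(smooth, jumpy)])
```

In a real setting `copy_predictor` would be replaced by a trained network operating on acoustic features such as log-mel frames, and each pair would consist of a grammatical and an ungrammatical spoken sentence.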
Anthology ID:
2026.cdl-1.11
Volume:
Proceedings of the 1st Workshop on Computational Developmental Linguistics (CDL)
Month:
July
Year:
2026
Address:
Grand Hyatt Manchester San Diego, 1 Market Pl, San Diego, CA 92101
Editors:
Martin Ziqiao Ma, Emmy Liu, Jing Liu, Tyler A. Chang, Abdellah Fourtassi, Alex Warstadt, Michael Hahn, Weiwei Sun, Freda Shi
Venues:
CDL | WS
Publisher:
Association for Computational Linguistics
Pages:
77–82
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.cdl-1.11/
Cite (ACL):
Shunsuke Kando and Yusuke Miyao. 2026. On the Learnability of Syntax from Raw Speech with Autoregressive Predictive Coding. In Proceedings of the 1st Workshop on Computational Developmental Linguistics (CDL), pages 77–82, Grand Hyatt Manchester San Diego, 1 Market Pl, San Diego, CA 92101. Association for Computational Linguistics.
Cite (Informal):
On the Learnability of Syntax from Raw Speech with Autoregressive Predictive Coding (Kando & Miyao, CDL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.cdl-1.11.pdf