What Matters in Tonotactic Learning

Han Li, Jeffrey Heinz


Abstract
This paper investigates whether tonotactic learning differs across representations and learning models. We conduct an experiment using the same dataset encoded in three representations: segments, features, and autosegmental representations (ARs). To the extent possible, two learning models are evaluated, the Maximum Entropy (MaxEnt) model and the Bottom-Up Factor Inference Algorithm (BUFIA), to examine how learning outcomes interact with both model type and representation. A follow-up experiment further explores the roles of frequency and complexity thresholds. The results show that (1) AR-based learning gives the strongest overall performance; (2) neither segmental nor featural representations hold a consistent advantage over the other across learning models; (3) MaxEnt performance improves substantially when frequency information is introduced; and (4) the effects of complexity bounds interact with representation type and frequency information. These findings suggest that tonotactic learning benefits from structurally explicit representations. Overall, this work highlights the importance of incorporating linguistically meaningful representations into learning.
Anthology ID:
2026.scil-main.18
Volume:
Proceedings of the Society for Computation in Linguistics 2026
Month:
July
Year:
2026
Address:
San Diego, CA
Editors:
Rob Voigt, Alex Warstadt, Naomi Feldman, Tal Linzen
Venues:
SCiL | WS
Publisher:
Association for Computational Linguistics
Pages:
180–190
URL:
https://preview.aclanthology.org/ingest-acl-workshops/2026.scil-main.18/
Cite (ACL):
Han Li and Jeffrey Heinz. 2026. What Matters in Tonotactic Learning. In Proceedings of the Society for Computation in Linguistics 2026, pages 180–190, San Diego, CA. Association for Computational Linguistics.
Cite (Informal):
What Matters in Tonotactic Learning (Li & Heinz, SCiL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl-workshops/2026.scil-main.18.pdf