Connor Mayer
2026
Quantifying mutual intelligibility gradients in Turkic languages using language models
Moldir Baidildinova | Shiva Upadhye | Austin Wagner | Connor Mayer | Richard Futrell
Proceedings of the Society for Computation in Linguistics 2026
Mutual intelligibility (MI) among related languages is a gradient phenomenon shaped by lexical, grammatical, and phonetic-phonological similarity. This study proposes a neural language modeling approach to quantifying MI patterns within the Turkic language family. Using IPA-transcribed naturalistic text from six Turkic languages, we train character-level LSTM models on a source language and fine-tune them on target languages that vary in genealogical distance. Cross-lingual transfer is evaluated using character-level cross-entropy (CE) loss, Area Under the Curve (AUC), and Rate of Change (ROC), which together capture model generalization, learning dynamics, and early-stage adaptation. We further examine whether model performance is predicted by cophenetic distance, lexical similarity, weighted trigram frequency overlap, and differences in vowel harmony index. Overall, the results suggest that character-level language models can approximate MI gradients across Turkic languages: closely related pairs generally show lower CE loss and smaller AUC, while more distant pairs show greater early-stage change. Lexical similarity, local phonotactic overlap, and genealogical distance appear to be the most informative predictors of model convergence. These findings provide preliminary evidence that neural language models trained on naturalistic text can offer a scalable way to model MI patterns, including directional asymmetries, across closely related languages.
2025
Reconciling categorical and gradient models of phonotactics
Connor Mayer
Proceedings of the Society for Computation in Linguistics 2025
2024
Proceedings of the Society for Computation in Linguistics 2024
Richard Futrell | Connor Mayer | Noga Zaslavsky
Proceedings of the Society for Computation in Linguistics 2024
2023
Rethinking representations: A log-bilinear model of phonotactics
Huteng Dai | Connor Mayer | Richard Futrell
Proceedings of the Society for Computation in Linguistics 2023
Modeling island effects with probabilistic tier-based strictly local grammars over trees
Charles Torres | Kenneth Hanson | Thomas Graf | Connor Mayer
Proceedings of the Society for Computation in Linguistics 2023
2021
Capturing gradience in long-distance phonology using probabilistic tier-based strictly local grammars
Connor Mayer
Proceedings of the Society for Computation in Linguistics 2021
2020
Phonotactic learning with neural language models
Connor Mayer | Max Nelson
Proceedings of the Society for Computation in Linguistics 2020
2018
Sanskrit n-Retroflexion is Input-Output Tier-Based Strictly Local
Thomas Graf | Connor Mayer
Proceedings of the Fifteenth Workshop on Computational Research in Phonetics, Phonology, and Morphology
Sanskrit /n/-retroflexion is one of the most complex segmental processes in phonology. While it is still star-free, it does not fit in any of the subregular classes that are commonly entertained in the literature. We show that when construed as a phonotactic dependency, the process fits into a class we call input-output tier-based strictly local (IO-TSL), a natural extension of the familiar class TSL. IO-TSL increases the power of TSL’s tier projection function by making it an input-output strictly local transduction. Assuming that /n/-retroflexion represents the upper bound on the complexity of segmental phonology, this shows that all of segmental phonology can be captured by combining the intuitive notion of tiers with the independently motivated machinery of strictly local mappings.