Mai Phan Quoc Hung


2026

The Subregular Hypothesis posits that phonological patterns in natural languages occupy a restricted region of the formal language hierarchy, yet the cognitive basis for this restriction remains unclear. We propose an information-theoretic characterization: Strictly Local (SL) languages, when formalized as shifts of finite type, are exactly the languages that admit stationary Markov sources, that is, sources whose conditional mutual information between distant positions is zero given the intervening symbols. We prove that certain non-subregular patterns, such as first-last assimilation, admit no such Markov realization, which offers an explanation for their unlearnability. Empirical validation on English phonotactics versus Finnish, Turkish, and Hungarian vowel harmony confirms that mutual information (MI) profiles statistically distinguish SL-like from Tier-based Strictly Local (TSL)-like patterns (p < 0.001, r = 0.84). This work bridges formal language theory and information theory, offering a unified framework for understanding computational restrictions on natural language phonology.
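The zero conditional mutual information property can be illustrated with a minimal sketch. The example below is not the paper's experimental setup: the two-symbol alphabet, the forbidden bigram "bb", and the transition probabilities are all assumptions chosen for illustration, as are the helper names `stationary`, `joint3`, `marginal`, `cond_mi`, and `mi`. It builds a first-order stationary Markov source for a toy SL-2 grammar and computes, exactly rather than from samples, the mutual information between two positions with and without conditioning on the symbol between them.

```python
from itertools import product
from math import log2

# Hypothetical SL-2 grammar over {a, b} that forbids the bigram "bb",
# realized as a first-order Markov chain (outer key: current symbol,
# inner key: next symbol). The probabilities are illustrative assumptions.
P = {"a": {"a": 0.5, "b": 0.5},
     "b": {"a": 1.0, "b": 0.0}}

def stationary(P, iters=200):
    """Approximate the stationary distribution by power iteration."""
    pi = {s: 1.0 / len(P) for s in P}
    for _ in range(iters):
        pi = {t: sum(pi[s] * P[s][t] for s in P) for t in P}
    return pi

def joint3(P, pi):
    """Exact joint distribution of three consecutive symbols (X0, X1, X2)."""
    return {(x, y, z): pi[x] * P[x][y] * P[y][z]
            for x, y, z in product(P, repeat=3)}

def marginal(joint, keep):
    """Sum out every position not listed in `keep`."""
    out = {}
    for key, p in joint.items():
        sub = tuple(key[i] for i in keep)
        out[sub] = out.get(sub, 0.0) + p
    return out

def cond_mi(joint):
    """I(X0; X2 | X1) in bits."""
    pxy, pyz, py = marginal(joint, (0, 1)), marginal(joint, (1, 2)), marginal(joint, (1,))
    return sum(p * log2(p * py[(y,)] / (pxy[(x, y)] * pyz[(y, z)]))
               for (x, y, z), p in joint.items() if p > 0)

def mi(joint):
    """I(X0; X2) in bits, without conditioning on X1."""
    pxz, px, pz = marginal(joint, (0, 2)), marginal(joint, (0,)), marginal(joint, (2,))
    return sum(p * log2(p / (px[(x,)] * pz[(z,)]))
               for (x, z), p in pxz.items() if p > 0)

if __name__ == "__main__":
    joint = joint3(P, stationary(P))
    print("I(X0; X2 | X1) =", cond_mi(joint))  # zero up to floating-point error
    print("I(X0; X2)      =", mi(joint))       # strictly positive
```

In this toy source the unconditioned positions remain statistically dependent, while conditioning on the single intervening symbol removes the dependence. Exact enumeration is used here only because the chain is tiny; an analysis of corpus data would instead need an estimator of conditional mutual information.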