Charles Yang

Also published as: Charles D. Yang


Learning Morphological Productivity as Meaning-Form Mappings
Sarah Payne | Jordan Kodner | Charles Yang
Proceedings of the Society for Computation in Linguistics 2021

Apparent Communicative Efficiency in the Lexicon is Emergent
Spencer Caplan | Jordan Kodner | Charles Yang
Proceedings of the Society for Computation in Linguistics 2021


Modeling Morphological Typology for Unsupervised Learning of Language Morphology
Hongzhi Xu | Jordan Kodner | Mitchell Marcus | Charles Yang
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

This paper describes a language-independent model for fully unsupervised morphological analysis that exploits a universal framework leveraging morphological typology. By modeling morphological processes including suffixation, prefixation, infixation, and full and partial reduplication with constrained stem-change rules, our system effectively constrains the search space and offers wide typological coverage. The system is tested on nine typologically and genetically diverse languages and outperforms leading systems. We also investigate the effect of an oracle that provides only a handful of bits per language to signal morphological type.
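The search-space constraint can be illustrated with a minimal sketch (not the paper's actual system): given a typology hint, as a few-bit oracle might supply, the analyzer tries only the morphological processes licensed for that language. The typology labels and the toy affix lists are hypothetical, for illustration only.

```python
# Hedged sketch: typology-constrained morphological analysis.
# The typology labels and affix lists below are toy assumptions,
# not the paper's inventory.

def analyze(word, typology):
    """Return (stem, process) under a hypothetical typology oracle."""
    if typology == "reduplicating":
        half = len(word) // 2
        # Full reduplication: the word is a doubled stem.
        if half > 1 and word[:half] == word[half:2 * half]:
            return word[:half], "full-reduplication"
    if typology == "suffixing":
        for suffix in ("ing", "ed", "s"):  # toy suffix list
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[: -len(suffix)], "suffix:" + suffix
    if typology == "prefixing":
        for prefix in ("un", "re"):  # toy prefix list
            if word.startswith(prefix) and len(word) > len(prefix) + 2:
                return word[len(prefix):], "prefix:" + prefix
    return word, "none"

print(analyze("walking", "suffixing"))     # ('walk', 'suffix:ing')
print(analyze("wikiwiki", "reduplicating"))  # ('wiki', 'full-reduplication')
```

Because the oracle rules out unlicensed processes up front, the analyzer never considers, say, reduplicative splits for a strictly suffixing language, which is the sense in which typology constrains the search space.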


Modeling Hierarchical Syntactic Structures in Morphological Processing
Yohei Oseki | Charles Yang | Alec Marantz
Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics

Sentences are represented as hierarchical syntactic structures, which have been successfully modeled in sentence processing. In contrast, despite theoretical agreement that words also have hierarchical syntactic structure internally, words have been argued to be computationally simpler than sentences and have been implemented by finite-state models as linear strings of morphemes; even the psychological reality of morphemes has been denied. In this paper, we extend the computational models employed in sentence processing to morphological processing. Taking incremental surprisal as the linking hypothesis, we evaluated five computational models with different representational assumptions against human reaction times in visual lexical decision experiments available from the English Lexicon Project (ELP), a "shared task" in the morphological processing literature. The simulation experiment demonstrated that (i) "amorphous" models without morpheme units underperformed relative to "morphous" models, (ii) a computational model with hierarchical syntactic structures, Probabilistic Context-Free Grammar (PCFG), most accurately explained human reaction times, and (iii) this performance was achieved on top of surface frequency effects. These results strongly suggest that morphological processing tracks morphemes incrementally from left to right and parses them into hierarchical syntactic structures, contrary to "amorphous" and finite-state models of morphological processing.
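The linking hypothesis can be sketched concretely: each morpheme's surprisal is -log2 P(morpheme | context), summed incrementally left to right. The sketch below uses a bigram model over a toy morpheme corpus purely for brevity; the paper's best-performing model is a PCFG, and the corpus here is invented for illustration.

```python
import math
from collections import Counter

# Hedged sketch of incremental surprisal over morpheme sequences,
# using a toy bigram model (the paper's winning model is a PCFG).
corpus = [["un", "lock", "able"], ["un", "do"], ["lock", "s"], ["do", "able"]]

bigrams = Counter()
unigrams = Counter()
for morphemes in corpus:
    seq = ["<s>"] + morphemes  # sentence-initial marker
    for prev, cur in zip(seq, seq[1:]):
        bigrams[(prev, cur)] += 1
        unigrams[prev] += 1

def surprisal(morphemes):
    """Sum of -log2 P(m_i | m_{i-1}) over the sequence (MLE, no smoothing)."""
    total = 0.0
    seq = ["<s>"] + morphemes
    for prev, cur in zip(seq, seq[1:]):
        p = bigrams[(prev, cur)] / unigrams[prev]
        total += -math.log2(p)
    return total

# P(un|<s>) = 2/4 and P(lock|un) = 1/2, so total surprisal = 1 + 1 = 2 bits.
print(surprisal(["un", "lock"]))  # 2.0
```

Under the linking hypothesis, higher summed surprisal predicts longer lexical decision times, which is how each candidate model's predictions are compared against the ELP reaction-time data.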


Unsupervised Morphology Learning with Statistical Paradigms
Hongzhi Xu | Mitchell Marcus | Charles Yang | Lyle Ungar
Proceedings of the 27th International Conference on Computational Linguistics

This paper describes an unsupervised model for morphological segmentation that exploits the notion of paradigms, which are sets of morphological categories (e.g., suffixes) that can be applied to a homogeneous set of words (e.g., nouns or verbs). Our algorithm identifies statistically reliable paradigms from the morphological segmentations produced by a probabilistic model and chooses reliable suffixes from them. The new suffixes can be fed back iteratively to improve the accuracy of the probabilistic model. Finally, unreliable paradigms are pruned to eliminate spurious morphological relations between words. The paradigm-based algorithm significantly improves segmentation accuracy. Our method achieves state-of-the-art results in experiments on the Morpho Challenge data, including English, Turkish, and Finnish.
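The core paradigm idea can be sketched in a few lines (this is an illustration of the concept, not the paper's full algorithm): group candidate stems by the set of suffixes they take, and treat suffix sets shared by many stems as reliable paradigms while pruning the rest. The segmentations and the support threshold below are toy assumptions.

```python
from collections import defaultdict

# Hedged sketch: paradigms as suffix sets shared across stems.
# Toy candidate segmentations, as a probabilistic segmenter might propose.
segmentations = [
    ("walk", "s"), ("walk", "ed"), ("walk", "ing"),
    ("talk", "s"), ("talk", "ed"), ("talk", "ing"),
    ("jump", "s"), ("jump", "ed"), ("jump", "ing"),
    ("windo", "w"),  # a spurious split that no other stem supports
]

# Collect the suffix set each candidate stem takes.
stem_suffixes = defaultdict(set)
for stem, suffix in segmentations:
    stem_suffixes[stem].add(suffix)

# Group stems by their suffix set: each set is a candidate paradigm.
paradigms = defaultdict(set)
for stem, suffixes in stem_suffixes.items():
    paradigms[frozenset(suffixes)].add(stem)

# Keep paradigms supported by at least 2 stems (toy threshold); prune the rest.
reliable = {p: stems for p, stems in paradigms.items() if len(stems) >= 2}
print(reliable)  # {frozenset({'s', 'ed', 'ing'}): {'walk', 'talk', 'jump'}}
```

The {s, ed, ing} paradigm survives because three stems share it, while the spurious "windo" + "w" analysis is pruned; the surviving suffixes can then be fed back to resegment the corpus, as the abstract describes.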


Case Studies in the Automatic Characterization of Grammars from Small Wordlists
Jordan Kodner | Spencer Caplan | Hongzhi Xu | Mitchell P. Marcus | Charles Yang
Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages


A Statistical Test for Grammar
Charles Yang
Proceedings of the 2nd Workshop on Cognitive Modeling and Computational Linguistics


Recession Segmentation: Simpler Online Word Segmentation Using Limited Resources
Constantine Lignos | Charles Yang
Proceedings of the Fourteenth Conference on Computational Natural Language Learning


Statistics Learning and Universal Grammar: Modeling Word Segmentation
Timothy Gambell | Charles Yang
Proceedings of the Workshop on Psycho-Computational Models of Human Language Acquisition


A Selectionist Theory of Language Acquisition
Charles D. Yang
Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics


Principle-based Parsing for Chinese
Charles D. Yang | Robert C. Berwick
Proceedings of the 11th Pacific Asia Conference on Language, Information and Computation