Chu-Cheng Lin


2021

Limitations of Autoregressive Models and Their Alternatives
Chu-Cheng Lin | Aaron Jaech | Xin Li | Matthew R. Gormley | Jason Eisner
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Standard autoregressive language models perform only polynomial-time computation to compute the probability of the next symbol. While this is attractive, it means they cannot model distributions whose next-symbol probability is hard to compute. Indeed, they cannot even model them well enough to solve associated easy decision problems for which an engineer might want to consult a language model. These limitations apply no matter how much computation and data are used to train the model, unless the model is given access to oracle parameters that grow superpolynomially in sequence length. Thus, simply training larger autoregressive language models is not a panacea for NLP. Alternatives include energy-based models (which give up efficient sampling) and latent-variable autoregressive models (which give up efficient scoring of a given string). Both are powerful enough to escape the above limitations.
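A minimal sketch in Python of the contrast the abstract draws; the function names and toy scoring setup are illustrative assumptions, not code from the paper:

import math

def autoregressive_logprob(tokens, next_logprob):
    # log p(x) = sum_t log p(x_t | x_{<t}); each next_logprob call is
    # assumed to run in polynomial time, which is exactly the
    # restriction the paper analyzes.
    return sum(next_logprob(tuple(tokens[:t]), tok)
               for t, tok in enumerate(tokens))

def energy_based_logprob(tokens, energy, all_strings):
    # An energy-based model scores the whole string globally. Exact
    # normalization sums over every string, so efficient sampling is
    # given up; enumerable only in this toy universe of strings.
    log_z = math.log(sum(math.exp(-energy(s)) for s in all_strings))
    return -energy(tuple(tokens)) - log_z

Under these assumptions, the autoregressive model scores and samples efficiently but is confined to poly-time next-symbol distributions, while the energy-based model lifts that limit at the cost of tractable sampling, matching the trade-off the abstract describes.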

2019

Neural Finite-State Transducers: Beyond Rational Relations
Chu-Cheng Lin | Hao Zhu | Matthew R. Gormley | Jason Eisner
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

We introduce neural finite state transducers (NFSTs), a family of string transduction models defining joint and conditional probability distributions over pairs of strings. The probability of a string pair is obtained by marginalizing over all its accepting paths in a finite state transducer. In contrast to ordinary weighted FSTs, however, each path is scored using an arbitrary function such as a recurrent neural network, which breaks the usual conditional independence assumption (Markov property). NFSTs are more powerful than previous finite-state models with neural features (Rastogi et al., 2016). We present training and inference algorithms for locally and globally normalized variants of NFSTs. In experiments on different transduction tasks, they compete favorably against seq2seq models while offering interpretable paths that correspond to hard monotonic alignments.
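A minimal sketch of the scoring idea, with a toy stand-in for the RNN path scorer; the names and the toy non-Markovian score are assumptions for illustration, not the paper's implementation:

import math

def path_score(path, theta):
    # Score an entire accepting path (a sequence of arc labels) with an
    # arbitrary function of the whole path; here a toy stand-in for the
    # RNN scorer. The bonus term inspects all earlier arcs, so the score
    # does not decompose arc-by-arc (no Markov property).
    return sum(theta.get(arc, 0.0) + 0.5 * (arc in path[:i])
               for i, arc in enumerate(path))

def string_pair_logweight(accepting_paths, theta):
    # Unnormalized log weight of a string pair: log-sum-exp over the
    # scores of all its accepting paths in the transducer.
    scores = [path_score(p, theta) for p in accepting_paths]
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))

Because exact normalization of such path scores is no longer tractable by standard FST dynamic programming, specialized training and inference algorithms of the kind the paper presents are needed for the locally and globally normalized variants.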

2018

Neural Particle Smoothing for Sampling from Conditional Sequence Models
Chu-Cheng Lin | Jason Eisner
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

We introduce neural particle smoothing, a sequential Monte Carlo method for sampling annotations of an input string from a given probability model. In contrast to conventional particle filtering algorithms, we train a proposal distribution that looks ahead to the end of the input string by means of a right-to-left LSTM. We demonstrate that this innovation can improve the quality of the sample. To motivate our formal choices, we explain how neural transduction models and our sampler can be viewed as low-dimensional but nonlinear approximations to working with HMMs over very large state spaces.
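A minimal sketch of the sequential importance sampling step the abstract describes; propose and model_logp are hypothetical stand-ins for the learned lookahead proposal (e.g., fed by a right-to-left LSTM summary of the suffix) and the target model, and resampling is omitted for brevity:

def neural_particle_smoothing(xs, propose, model_logp, num_particles=100):
    # Each particle is (annotation_prefix, log_weight). Unlike a plain
    # particle filter, the proposal also conditions on the unread suffix
    # xs[t:], which is what the right-to-left LSTM summarizes.
    particles = [([], 0.0) for _ in range(num_particles)]
    for t in range(len(xs)):
        step = []
        for ys, logw in particles:
            y, q_logp = propose(ys, xs[:t + 1], xs[t:])  # lookahead here
            # incremental importance weight: target ratio over proposal
            logw += (model_logp(ys + [y], xs[:t + 1])
                     - model_logp(ys, xs[:t]) - q_logp)
            step.append((ys + [y], logw))
        particles = step
    return particles

The better the proposal anticipates the rest of the input, the closer the incremental weights stay to uniform, which is the sense in which lookahead improves sample quality.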

2015

Not All Contexts Are Created Equal: Better Word Representations with Variable Attention
Wang Ling | Yulia Tsvetkov | Silvio Amir | Ramón Fermandez | Chris Dyer | Alan W Black | Isabel Trancoso | Chu-Cheng Lin
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Unsupervised POS Induction with Word Embeddings
Chu-Cheng Lin | Waleed Ammar | Chris Dyer | Lori Levin
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

The CMU Submission for the Shared Task on Language Identification in Code-Switched Data
Chu-Cheng Lin | Waleed Ammar | Lori Levin | Chris Dyer
Proceedings of the First Workshop on Computational Approaches to Code Switching

Automatic Classification of Communicative Functions of Definiteness
Archna Bhatia | Chu-Cheng Lin | Nathan Schneider | Yulia Tsvetkov | Fatima Talib Al-Raisi | Laleh Roostapour | Jordan Bender | Abhimanu Kumar | Lori Levin | Mandy Simons | Chris Dyer
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2009

Modeling the Relationship among Linguistic Typological Features with Hierarchical Dirichlet Process
Chu-Cheng Lin | Yu-Chun Wang | Richard Tzong-Han Tsai
Proceedings of the 23rd Pacific Asia Conference on Language, Information and Computation, Volume 2

2007

Korean-Chinese Person Name Translation for Cross Language Information Retrieval
Yu-Chun Wang | Yi-Hsun Lee | Chu-Cheng Lin | Tzong-Han Richard Tsai | Wen-Lian Hsu
Proceedings of the 21st Pacific Asia Conference on Language, Information and Computation