David Chiang


2022

Overcoming a Theoretical Limitation of Self-Attention
David Chiang | Peter Cholak
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Although transformers are remarkably effective for many tasks, there are some surprisingly easy-looking regular languages that they struggle with. Hahn shows that for languages where acceptance depends on a single input symbol, a transformer’s classification decisions get closer and closer to random guessing (that is, a cross-entropy of 1) as input strings get longer and longer. We examine this limitation using two languages: PARITY, the language of bit strings with an odd number of 1s, and FIRST, the language of bit strings starting with a 1. We demonstrate three ways of overcoming the limitation implied by Hahn’s lemma. First, we settle an open question by constructing a transformer that recognizes PARITY with perfect accuracy, and similarly for FIRST. Second, we use layer normalization to bring the cross-entropy of both models arbitrarily close to zero. Third, when transformers need to focus on a single position, as for FIRST, we find that they can fail to generalize to longer strings; we offer a simple remedy to this problem that also improves length generalization in machine translation.
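
The two languages named in the abstract are easy to state as membership tests. The sketch below is illustrative only and is not taken from the paper; the function names `in_parity` and `in_first` are hypothetical. It also shows that flipping a single input symbol changes membership in PARITY, which is the property the abstract refers to when it says acceptance depends on a single input symbol.

```python
def in_parity(bits: str) -> bool:
    """Return True if the bit string contains an odd number of 1s."""
    return bits.count("1") % 2 == 1


def in_first(bits: str) -> bool:
    """Return True if the bit string starts with a 1."""
    return bits.startswith("1")


def flip(bits: str, i: int) -> str:
    """Return a copy of the bit string with the symbol at position i flipped."""
    flipped = "0" if bits[i] == "1" else "1"
    return bits[:i] + flipped + bits[i + 1:]


if __name__ == "__main__":
    s = "1011"
    print(in_parity(s), in_first(s))
    # Flipping any one position always changes PARITY membership.
    print(all(in_parity(flip(s, i)) != in_parity(s) for i in range(len(s))))
```

For FIRST, only position 0 matters: flipping any later symbol leaves membership unchanged.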

2021

Syntax-Based Attention Masking for Neural Machine Translation
Colin McDonald | David Chiang
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop

We present a simple method for extending transformers to source-side trees. We define a number of masks that limit self-attention based on relationships among tree nodes, and we allow each attention head to learn which mask or masks to use. On translation from English to various low-resource languages, and translation in both directions between English and German, our method always improves over simple linearization of the source-side parse tree and almost always improves over a sequence-to-sequence baseline, by up to +2.1 BLEU.

Data Augmentation by Concatenation for Low-Resource Translation: A Mystery and a Solution
Toan Q. Nguyen | Kenton Murray | David Chiang
Proceedings of the 18th International Conference on Spoken Language Translation (IWSLT 2021)

In this paper, we investigate the driving factors behind concatenation, a simple but effective data augmentation method for low-resource neural machine translation. Our experiments suggest that discourse context is unlikely to be the reason concatenation improves BLEU by about +1 across four language pairs. Instead, we demonstrate that the improvement comes from three other factors unrelated to discourse: context diversity, length diversity, and (to a lesser extent) position shifting.

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: Tutorial Abstracts
David Chiang | Min Zhang

2020

Learning Context-free Languages with Nondeterministic Stack RNNs
Brian DuSell | David Chiang
Proceedings of the 24th Conference on Computational Natural Language Learning

We present a differentiable stack data structure that simultaneously and tractably encodes an exponential number of stack configurations, based on Lang’s algorithm for simulating nondeterministic pushdown automata. We call the combination of this data structure with a recurrent neural network (RNN) controller a Nondeterministic Stack RNN. We compare our model against existing stack RNNs on various formal languages, demonstrating that our model converges more reliably to algorithmic behavior on deterministic tasks, and achieves lower cross-entropy on inherently nondeterministic tasks.

2019

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation
Kenton Murray | Jeffery Kinnison | Toan Q. Nguyen | Walter Scheirer | David Chiang
Proceedings of the 3rd Workshop on Neural Generation and Translation

Neural sequence-to-sequence models, particularly the Transformer, are the state of the art in machine translation. Yet these neural networks are very sensitive to architecture and hyperparameter settings. Optimizing these settings by grid or random search is computationally expensive because it requires many training runs. In this paper, we incorporate architecture search into a single training run through auto-sizing, which uses regularization to delete neurons in a network over the course of training. On very low-resource language pairs, we show that auto-sizing can improve BLEU scores by up to 3.9 points while removing one-third of the parameters from the model.

Efficiency through Auto-Sizing: Notre Dame NLP’s Submission to the WNGT 2019 Efficiency Task
Kenton Murray | Brian DuSell | David Chiang
Proceedings of the 3rd Workshop on Neural Generation and Translation

This paper describes the Notre Dame Natural Language Processing Group’s (NDNLP) submission to the WNGT 2019 shared task (Hayashi et al., 2019). We investigated the impact of applying auto-sizing (Murray and Chiang, 2015; Murray et al., 2019) to the Transformer network (Vaswani et al., 2017) with the goal of substantially reducing the number of parameters in the model. Our method was able to eliminate more than 25% of the model’s parameters while suffering a decrease of only 1.1 BLEU.

Accelerating Sparse Matrix Operations in Neural Networks on Graphics Processing Units
Arturo Argueta | David Chiang
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Graphics Processing Units (GPUs) are commonly used to train and evaluate neural networks efficiently. While previous work in deep learning has focused on accelerating operations on dense matrices/tensors on GPUs, few efforts have concentrated on operations involving sparse data structures. Operations using sparse structures are common in natural language models at the input and output layers, because these models operate on sequences over discrete alphabets. We present two new GPU algorithms: one at the input layer, for multiplying a matrix by a few-hot vector (generalizing the more common operation of multiplication by a one-hot vector) and one at the output layer, for a fused softmax and top-N selection (commonly used in beam search). Our methods achieve speedups over state-of-the-art parallel GPU baselines of up to 7x and 50x, respectively. We also illustrate how our methods scale on different GPU architectures.
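
To make the input-layer operation concrete: multiplying a matrix by a few-hot vector (a vector that is 1 at a few positions and 0 elsewhere) gives the same result as summing the selected columns, so the zeros never need to be touched. The sketch below is a plain CPU illustration of that equivalence under these assumptions; it is not the GPU algorithm from the paper, and the names `matmul_few_hot` and `matmul_dense` are hypothetical.

```python
def matmul_dense(matrix, vector):
    """Ordinary matrix-vector product over a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def matmul_few_hot(matrix, hot_indices):
    """Product of the matrix with a few-hot vector.

    The vector is given only by the indices of its 1 entries, so the
    result is the sum of the corresponding columns of the matrix.
    """
    return [sum(row[j] for j in hot_indices) for row in matrix]


if __name__ == "__main__":
    m = [[1, 2, 3],
         [4, 5, 6]]
    # The few-hot vector [1, 0, 1] is represented by the indices [0, 2].
    print(matmul_few_hot(m, [0, 2]))
    print(matmul_dense(m, [1, 0, 1]))
```

With a single index, `matmul_few_hot` reduces to the one-hot case, which is a column lookup.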

Neural Machine Translation of Text from Non-Native Speakers
Antonios Anastasopoulos | Alison Lui | Toan Q. Nguyen | David Chiang
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Neural Machine Translation (NMT) systems are known to degrade when confronted with noisy data, especially when the system is trained only on clean data. In this paper, we show that augmenting training data with sentences containing artificially-introduced grammatical errors can make the system more robust to such errors. In combination with an automatic grammar error correction system, we can recover 1.0 BLEU out of 2.4 BLEU lost due to grammatical errors. We also present a set of Spanish translations of the JFLEG grammar error correction corpus, which allows for testing NMT robustness to real grammatical errors.

2018

Weighted DAG Automata for Semantic Graphs
David Chiang | Frank Drewes | Daniel Gildea | Adam Lopez | Giorgio Satta
Computational Linguistics, Volume 44, Issue 1 - April 2018

Graphs have a variety of uses in natural language processing, particularly as representations of linguistic meaning. A deficit in this area of research is a formal framework for creating, combining, and using models involving graphs that parallels the frameworks of finite automata for strings and finite tree automata for trees. A possible starting point for such a framework is the formalism of directed acyclic graph (DAG) automata, defined by Kamimura and Slutzki and extended by Quernheim and Knight. In this article, we study the latter in depth, demonstrating several new results, including a practical recognition algorithm that can be used for inference and learning with models defined on DAG automata. We also propose an extension to graphs with unbounded node degree and show that our results carry over to the extended formalism.

Tied Multitask Learning for Neural Speech Translation
Antonios Anastasopoulos | David Chiang
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

We explore multitask models for neural translation of speech, augmenting them in order to reflect two intuitive notions. First, we introduce a model where the second task decoder receives information from the decoder of the first task, since higher-level intermediate representations should provide useful information. Second, we apply regularization that encourages transitivity and invertibility. We show that the application of these notions on jointly trained models improves performance on the tasks of low-resource speech transcription and translation. It also leads to better performance when using attention information for word discovery over unsegmented input.

Improving Lexical Choice in Neural Machine Translation
Toan Nguyen | David Chiang
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

We explore two solutions to the problem of mistranslating rare words in neural machine translation. First, we argue that the standard output layer, which computes the inner product of a vector representing the context with all possible output word embeddings, rewards frequent words disproportionately, and we propose to fix the norms of both vectors to a constant value. Second, we integrate a simple lexical module which is jointly trained with the rest of the model. We evaluate our approaches on eight language pairs with data sizes ranging from 100k to 8M words, and achieve improvements of up to +4.3 BLEU, surpassing phrase-based translation in nearly all settings.

Combining Character and Word Information in Neural Machine Translation Using a Multi-Level Attention
Huadong Chen | Shujian Huang | David Chiang | Xinyu Dai | Jiajun Chen
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Natural language sentences, being hierarchical, can be represented at different levels of granularity, like words, subwords, or characters. But most neural machine translation systems require the sentence to be represented as a sequence at a single level of granularity. It can be difficult to determine which granularity is better for a particular translation task. In this paper, we improve the model by incorporating multiple levels of granularity. Specifically, we propose (1) an encoder with character attention which augments the (sub)word-level representation with character-level information; (2) a decoder with multiple attentions that enable the representations from different levels of granularity to control the translation cooperatively. Experiments on three translation tasks demonstrate that our proposed models outperform the standard word-based model, the subword-based model, and a strong character-based model.

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Ellen Riloff | David Chiang | Julia Hockenmaier | Jun’ichi Tsujii

Part-of-Speech Tagging on an Endangered Language: a Parallel Griko-Italian Resource
Antonios Anastasopoulos | Marika Lekakou | Josep Quer | Eleni Zimianiti | Justin DeBenedetto | David Chiang
Proceedings of the 27th International Conference on Computational Linguistics

Most work on part-of-speech (POS) tagging is focused on high-resource languages, or examines low-resource and active learning settings through simulated studies. We evaluate POS tagging techniques on an actual endangered language, Griko. We present a resource that contains 114 narratives in Griko, along with sentence-level translations in Italian, and provides gold annotations for the test set. Based on a previously collected small corpus, we investigate several traditional methods, as well as methods that take advantage of monolingual data or project cross-lingual POS tags. We show that the combination of a semi-supervised method with cross-lingual transfer is more appropriate for this extremely challenging setting, with the best tagger achieving an accuracy of 72.9%. With an applied active learning scheme, which we use to collect sentence-level annotations over the test set, we achieve improvements of more than 21 percentage points.

Correcting Length Bias in Neural Machine Translation
Kenton Murray | David Chiang
Proceedings of the Third Conference on Machine Translation: Research Papers

We study two problems in neural machine translation (NMT). First, in beam search, whereas a wider beam should in principle help translation, it often hurts NMT. Second, NMT has a tendency to produce translations that are too short. Here, we argue that these problems are closely related and both rooted in label bias. We show that correcting the brevity problem almost eliminates the beam problem; we compare some commonly-used methods for doing this, finding that a simple per-word reward works well; and we introduce a simple and quick way to tune this reward using the perceptron algorithm.
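
A per-word reward is commonly implemented by adding a constant bonus for each output word to a hypothesis's log-probability, which offsets the preference for short outputs. The sketch below assumes that additive form; the hypotheses, their scores, and the reward value 0.6 are made up for illustration, and the names `rescore` and `best_hypothesis` are hypothetical rather than taken from the paper.

```python
def rescore(log_prob: float, length: int, reward: float) -> float:
    """Add a fixed reward per output word to a hypothesis's log-probability."""
    return log_prob + reward * length


def best_hypothesis(hyps, reward: float) -> str:
    """Pick the highest-scoring hypothesis from (text, log_prob) pairs."""
    return max(hyps, key=lambda h: rescore(h[1], len(h[0].split()), reward))[0]


# Made-up hypotheses: the shorter one has the higher raw log-probability.
HYPS = [("the cat", -2.0), ("the cat sat down", -3.0)]

if __name__ == "__main__":
    print(best_hypothesis(HYPS, reward=0.0))
    print(best_hypothesis(HYPS, reward=0.6))
```

With no reward the shorter hypothesis wins; with a reward of 0.6 per word the longer one overtakes it, since -3.0 + 4 × 0.6 exceeds -2.0 + 2 × 0.6.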

Composing Finite State Transducers on GPUs
Arturo Argueta | David Chiang
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Weighted finite state transducers (FSTs) are frequently used in language processing to handle tasks such as part-of-speech tagging and speech recognition. There has been previous work using multiple CPU cores to accelerate finite state algorithms, but limited attention has been given to parallel graphics processing unit (GPU) implementations. In this paper, we introduce the first (to our knowledge) GPU implementation of the FST composition operation, and we also discuss the optimizations used to achieve the best performance on this architecture. We show that our approach obtains speedups of up to 6 times over our serial implementation and 4.5 times over OpenFST.

2017

Improved Neural Machine Translation with a Syntax-Aware Encoder and Decoder
Huadong Chen | Shujian Huang | David Chiang | Jiajun Chen
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Most neural machine translation (NMT) models are based on the sequential encoder-decoder framework, which makes no use of syntactic information. In this paper, we improve this model by explicitly incorporating source-side syntactic trees. More specifically, we propose (1) a bidirectional tree encoder which learns both sequential and tree structured representations; (2) a tree-coverage model that lets the attention depend on the source-side syntax. Experiments on Chinese-English translation demonstrate that our proposed models outperform the sequential attentional model as well as a stronger baseline with a bottom-up tree encoder and word coverage.

Top-Rank Enhanced Listwise Optimization for Statistical Machine Translation
Huadong Chen | Shujian Huang | David Chiang | Xinyu Dai | Jiajun Chen
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017)

Pairwise ranking methods are the most widely used discriminative training approaches for structure prediction problems in natural language processing (NLP). Decomposing the problem of ranking hypotheses into pairwise comparisons enables simple and efficient solutions. However, neglecting the global ordering of the hypothesis list may hinder learning. We propose a listwise learning framework for structure prediction problems such as machine translation. Our framework directly models the entire translation list’s ordering to learn parameters which may better fit the given listwise samples. Furthermore, we propose top-rank enhanced loss functions, which are more sensitive to ranking errors at higher positions. Experiments on a large-scale Chinese-English translation task show that both our listwise learning framework and top-rank enhanced listwise losses lead to significant improvements in translation quality.

Decoding with Finite-State Transducers on GPUs
Arturo Argueta | David Chiang
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

Weighted finite automata and transducers (including hidden Markov models and conditional random fields) are widely used in natural language processing (NLP) to perform tasks such as morphological analysis, part-of-speech tagging, chunking, named entity recognition, speech recognition, and others. Parallelizing finite state algorithms on graphics processing units (GPUs) would benefit many areas of NLP. Although researchers have implemented GPU versions of basic graph algorithms, no work, to our knowledge, has been done on GPU algorithms for weighted finite automata. We introduce a GPU implementation of the Viterbi and forward-backward algorithms, achieving speedups of up to 4x over our serial implementations running on different computer architectures and 3335x over widely used tools such as OpenFST.

Transfer Learning across Low-Resource, Related Languages for Neural Machine Translation
Toan Q. Nguyen | David Chiang
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

We present a simple method to improve neural translation of a low-resource language pair using parallel data from a related, also low-resource, language pair. The method is based on the transfer method of Zoph et al., but whereas their method ignores any source vocabulary overlap, ours exploits it. First, we split words using Byte Pair Encoding (BPE) to increase vocabulary overlap. Then, we train a model on the first language pair and transfer its parameters, including its source word embeddings, to another model and continue training on the second language pair. Our experiments show that transfer learning helps word-based translation only slightly, but when used on top of a much stronger BPE baseline, it yields larger improvements of up to 4.3 BLEU.

A case study on using speech-to-translation alignments for language documentation
Antonios Anastasopoulos | David Chiang
Proceedings of the 2nd Workshop on the Use of Computational Methods in the Study of Endangered Languages

Spoken Term Discovery for Language Documentation using Translations
Antonios Anastasopoulos | Sameer Bansal | David Chiang | Sharon Goldwater | Adam Lopez
Proceedings of the Workshop on Speech-Centric Natural Language Processing

Vast amounts of speech data collected for language documentation and research remain untranscribed and unsearchable, but often a small amount of speech may have text translations available. We present a method for partially labeling additional speech with translations in this scenario. We modify an unsupervised speech-to-translation alignment model and obtain prototype speech segments that match the translation words, which are in turn used to discover terms in the unlabelled data. We evaluate our method on a Spanish-English speech translation corpus and on two corpora of endangered languages, Arapaho and Ainu, demonstrating its appropriateness and applicability in an actual very-low-resource scenario.

2016

Proceedings of the 12th International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+12)
David Chiang | Alexander Koller

An Attentional Model for Speech Translation Without Transcription
Long Duong | Antonios Anastasopoulos | David Chiang | Steven Bird | Trevor Cohn
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

An Unsupervised Probability Model for Speech-to-Translation Alignment of Low-Resource Languages
Antonios Anastasopoulos | David Chiang | Long Duong
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

2015

Auto-Sizing Neural Networks: With Applications to n-gram Language Models
Kenton Murray | David Chiang
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Supervised Phrase Table Triangulation with Neural Word Embeddings for Low-Resource Languages
Tomer Levinboim | David Chiang
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

Model Invertibility Regularization: Sequence Alignment With or Without Parallel Data
Tomer Levinboim | Ashish Vaswani | David Chiang
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Multi-Task Word Alignment Triangulation for Low-Resource Languages
Tomer Levinboim | David Chiang
Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

2014

Kneser-Ney Smoothing on Expected Counts
Hui Zhang | David Chiang
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Improving Word Alignment using Word Similarity
Theerawat Songyot | David Chiang
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2013

Parsing Graphs with Hyperedge Replacement Grammars
David Chiang | Jacob Andreas | Daniel Bauer | Karl Moritz Hermann | Bevan Jones | Kevin Knight
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Decoding with Large-Scale Neural Language Models Improves Translation
Ashish Vaswani | Yinggong Zhao | Victoria Fossum | David Chiang
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2012

Machine Translation for Language Preservation
Steven Bird | David Chiang
Proceedings of COLING 2012: Posters

Smaller Alignment Models for Better Translations: Unsupervised Word Alignment with the l0-norm
Ashish Vaswani | Liang Huang | David Chiang
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

An Exploration of Forest-to-String Translation: Does Translation Help or Hurt Parsing?
Hui Zhang | David Chiang
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2011

Rule Markov Models for Fast Tree-to-String Translation
Ashish Vaswani | Haitao Mi | Liang Huang | David Chiang
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Language-Independent Parsing with Empty Elements
Shu Cai | David Chiang | Yoav Goldberg
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Models and Training for Unsupervised Preposition Sense Disambiguation
Dirk Hovy | Ashish Vaswani | Stephen Tratz | David Chiang | Eduard Hovy
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

Two Easy Improvements to Lexical Weighting
David Chiang | Steve DeNeefe | Michael Pust
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

Unsupervised Syntactic Alignment with Inversion Transduction Grammars
Adam Pauls | Dan Klein | David Chiang | Kevin Knight
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Bayesian Inference for Finite-State Transducers
David Chiang | Jonathan Graehl | Kevin Knight | Adam Pauls | Sujith Ravi
Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Fast, Greedy Model Minimization for Unsupervised Tagging
Sujith Ravi | Ashish Vaswani | Kevin Knight | David Chiang
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

Learning to Translate with Source and Target Syntax
David Chiang
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Efficient Optimization of an MDL-Inspired Objective Function for Unsupervised Part-Of-Speech Tagging
Ashish Vaswani | Adam Pauls | David Chiang
Proceedings of the ACL 2010 Conference Short Papers

2009

Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation (SSST-3) at NAACL HLT 2009
Dekai Wu | David Chiang

11,001 New Features for Statistical Machine Translation
David Chiang | Kevin Knight | Wei Wang
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics

Fast Consensus Decoding over Translation Forests
John DeNero | David Chiang | Kevin Knight
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP

2008

Online Large-Margin Training of Syntactic and Structural Translation Features
David Chiang | Yuval Marton | Philip Resnik
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

Decomposability of Translation Metrics for Improved Evaluation and Efficient Algorithms
David Chiang | Steve DeNeefe | Yee Seng Chan | Hwee Tou Ng
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing

Proceedings of the ACL-08: HLT Second Workshop on Syntax and Structure in Statistical Translation (SSST-2)
David Chiang | Dekai Wu

Flexible Composition and Delayed Tree-Locality
David Chiang | Tatjana Scheffler
Proceedings of the Ninth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+9)

Extracting Synchronous Grammar Rules From Word-Level Alignments in Linear Time
Hao Zhang | Daniel Gildea | David Chiang
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

2007

Proceedings of SSST, NAACL-HLT 2007 / AMTA Workshop on Syntax and Structure in Statistical Translation
Dekai Wu | David Chiang

Hierarchical Phrase-Based Translation
David Chiang
Computational Linguistics, Volume 33, Number 2, June 2007

Word Sense Disambiguation Improves Statistical Machine Translation
Yee Seng Chan | Hwee Tou Ng | David Chiang
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

Forest Rescoring: Faster Decoding with Integrated Language Models
Liang Huang | David Chiang
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

2006

Parsing Arabic Dialects
David Chiang | Mona Diab | Nizar Habash | Owen Rambow | Safiullah Shareef
11th Conference of the European Chapter of the Association for Computational Linguistics

The Hidden TAG Model: Synchronous Grammars for Parsing Resource-Poor Languages
David Chiang | Owen Rambow
Proceedings of the Eighth International Workshop on Tree Adjoining Grammar and Related Formalisms

The Weak Generative Capacity of Linear Tree-Adjoining Grammars
David Chiang
Proceedings of the Eighth International Workshop on Tree Adjoining Grammar and Related Formalisms

2005

The Hiero Machine Translation System: Extensions, Evaluation, and Analysis
David Chiang | Adam Lopez | Nitin Madnani | Christof Monz | Philip Resnik | Michael Subotin
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

A Hierarchical Phrase-Based Model for Statistical Machine Translation
David Chiang
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

Better k-best Parsing
Liang Huang | David Chiang
Proceedings of the Ninth International Workshop on Parsing Technology

2004

Uses and abuses of intersected languages
David Chiang
Proceedings of the 7th International Workshop on Tree Adjoining Grammar and Related Formalisms

2002

Recovering Latent Information in Treebanks
David Chiang | Daniel M. Bikel
COLING 2002: The 19th International Conference on Computational Linguistics

Putting Some Weakly Context-Free Formalisms in Order
David Chiang
Proceedings of the Sixth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+6)

2001

Facilitating Treebank Annotation Using a Statistical Parser
Fu-Dong Chiou | David Chiang | Martha Palmer
Proceedings of the First International Conference on Human Language Technology Research

Constraints on Strong Generative Power
David Chiang
Proceedings of the 39th Annual Meeting of the Association for Computational Linguistics

2000

Two Statistical Parsing Models Applied to the Chinese Treebank
Daniel M. Bikel | David Chiang
Second Chinese Language Processing Workshop

Some remarks on an extension of synchronous TAG
David Chiang | William Schuler | Mark Dras
Proceedings of the Fifth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+5)

Multi-Component TAG and Notions of Formal Power
William Schuler | David Chiang | Mark Dras
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics

Statistical Parsing with an Automatically-Extracted Tree Adjoining Grammar
David Chiang
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics