Alvin Grissom II


Rare but Severe Neural Machine Translation Errors Induced by Minimal Deletion: An Empirical Study on Chinese and English
Ruikang Shi | Alvin Grissom II | Duc Minh Trinh
Proceedings of the 29th International Conference on Computational Linguistics

We examine the inducement of rare but severe errors in English-Chinese and Chinese-English in-domain neural machine translation by minimal deletion of source text with character-based models. By deleting a single character, we can induce severe translation errors. We categorize these errors and compare the results of deleting single characters and single words. We also examine the effect of training data size on the number and types of pathological cases induced by these minimal perturbations, finding significant variation. We find that deleting a word hurts overall translation score more than deleting a character, but certain errors are more likely to occur when deleting characters, with language direction also influencing the effect.
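As an illustration of the minimal-deletion perturbations described in this abstract, the following sketch enumerates every single-character and single-word deletion of a source sentence. The function names are hypothetical, and word deletion assumes the input is already whitespace-tokenized (Chinese text would need to be segmented first); this is not the authors' code.

```python
def delete_each_char(sentence):
    """Return every variant of `sentence` with exactly one character removed."""
    return [sentence[:i] + sentence[i + 1:] for i in range(len(sentence))]


def delete_each_word(sentence):
    """Return every variant with exactly one whitespace-delimited word removed.

    Assumption: words are separated by whitespace (pre-segmented input).
    """
    words = sentence.split()
    return [" ".join(words[:i] + words[i + 1:]) for i in range(len(words))]


if __name__ == "__main__":
    for variant in delete_each_char("abc"):
        print(variant)
    for variant in delete_each_word("the cat sat"):
        print(variant)
```

Each variant would then be translated and compared against the translation of the unperturbed sentence to look for severe errors.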


An Attentive Recurrent Model for Incremental Prediction of Sentence-final Verbs
Wenyan Li | Alvin Grissom II | Jordan Boyd-Graber
Findings of the Association for Computational Linguistics: EMNLP 2020

Verb prediction is important for understanding human processing of verb-final languages, with practical applications to real-time simultaneous interpretation from verb-final to verb-medial languages. While previous approaches use classical statistical models, we introduce an attention-based neural model to incrementally predict final verbs from incomplete Japanese and German SOV sentences. To offer flexibility to the model, we further incorporate synonym awareness. Our approach both better predicts the final verbs in Japanese and German and provides more interpretable explanations of why those verbs are selected.


Assessing the Ability of Neural Machine Translation Models to Perform Syntactic Rewriting
Jahkel Robin | Alvin Grissom II | Matthew Roselli
Proceedings of the 2019 Workshop on Widening NLP

We describe work in progress for evaluating performance of sequence-to-sequence neural networks on the task of syntax-based reordering for rules applicable to simultaneous machine translation. We train models that attempt to rewrite English sentences using rules that are commonly used by human interpreters. We examine the performance of these models to determine which forms of rewriting are more difficult for them to learn and which architectures are the best at learning them.

Investigating Sports Commentator Bias within a Large Corpus of American Football Broadcasts
Jack Merullo | Luke Yeh | Abram Handler | Alvin Grissom II | Brendan O’Connor | Mohit Iyyer
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

Sports broadcasters inject drama into play-by-play commentary by building team and player narratives through subjective analyses and anecdotes. Prior studies based on small datasets and manual coding show that such theatrics evince commentator bias in sports broadcasts. To examine this phenomenon, we assemble FOOTBALL, which contains 1,455 broadcast transcripts from American football games across six decades that are automatically annotated with 250K player mentions and linked with racial metadata. We identify major confounding factors for researchers examining racial bias in FOOTBALL, and perform a computational analysis that supports conclusions from prior social science studies.


Pathologies of Neural Models Make Interpretations Difficult
Shi Feng | Eric Wallace | Alvin Grissom II | Mohit Iyyer | Pedro Rodriguez | Jordan Boyd-Graber
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

One way to interpret neural model predictions is to highlight the most important input features—for example, a heatmap visualization over the words in an input sentence. In existing interpretation methods for NLP, a word’s importance is determined by either input perturbation—measuring the decrease in model confidence when that word is removed—or by the gradient with respect to that word. To understand the limitations of these methods, we use input reduction, which iteratively removes the least important word from the input. This exposes pathological behaviors of neural models: the remaining words appear nonsensical to humans and are not the ones determined as important by interpretation methods. As we confirm with human experiments, the reduced examples lack information to support the prediction of any label, but models still make the same predictions with high confidence. To explain these counterintuitive results, we draw connections to adversarial examples and confidence calibration: pathological behaviors reveal difficulties in interpreting neural models trained with maximum likelihood. To mitigate their deficiencies, we fine-tune the models by encouraging high entropy outputs on reduced examples. Fine-tuned models become more interpretable under input reduction, without accuracy loss on regular examples.
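The input reduction procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: importance is measured by leave-one-out confidence (the perturbation variant mentioned above), `predict` is a hypothetical callable standing in for a trained model, and `toy_predict` is an invented toy classifier used only to make the example runnable.

```python
def input_reduction(words, predict):
    """Iteratively drop the least important word while the predicted label is unchanged.

    `predict` (hypothetical) maps a list of words to a dict of label -> probability.
    """
    probs = predict(words)
    label = max(probs, key=probs.get)
    current = list(words)
    while len(current) > 1:
        # Try removing each word; keep only removals that preserve the label.
        candidates = []
        for i in range(len(current)):
            reduced = current[:i] + current[i + 1:]
            reduced_probs = predict(reduced)
            if max(reduced_probs, key=reduced_probs.get) == label:
                candidates.append((reduced_probs[label], reduced))
        if not candidates:
            break  # every single removal would change the prediction
        # The least important word is the one whose removal costs the least confidence.
        current = max(candidates, key=lambda c: c[0])[1]
    return current, label


def toy_predict(words):
    """Toy stand-in model: 'positive' is likely only if the word 'good' is present."""
    p = 0.1 + 0.8 * min(1, words.count("good"))
    return {"positive": p, "negative": 1 - p}


if __name__ == "__main__":
    print(input_reduction(["the", "movie", "was", "good"], toy_predict))
```

With a real neural model, the reduced input returned by such a loop is where the pathologies discussed above would show up: the surviving words may look nonsensical to a human even though the model's prediction is unchanged.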


Substring Frequency Features for Segmentation of Japanese Katakana Words with Unlabeled Corpora
Yoshinari Fujinuma | Alvin Grissom II
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Word segmentation is crucial in natural language processing tasks for unsegmented languages. In Japanese, many out-of-vocabulary words appear in the phonetic syllabary katakana, making segmentation more difficult due to the lack of clues found in mixed script settings. In this paper, we propose a straightforward approach based on a variant of tf-idf and apply it to the problem of word segmentation in Japanese. Even though our method uses only an unlabeled corpus, experimental results show that it achieves performance comparable to existing methods that use manually labeled corpora. Furthermore, it improves performance of simple word segmentation models trained on a manually labeled corpus.


Incremental Prediction of Sentence-final Verbs: Humans versus Machines
Alvin Grissom II | Naho Orita | Jordan Boyd-Graber
Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning


Syntax-based Rewriting for Simultaneous Machine Translation
He He | Alvin Grissom II | John Morgan | Jordan Boyd-Graber | Hal Daumé III
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing


Don’t Until the Final Verb Wait: Reinforcement Learning for Simultaneous Machine Translation
Alvin Grissom II | He He | Jordan Boyd-Graber | John Morgan | Hal Daumé III
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)


Annotating Factive Verbs
Alvin Grissom II | Yusuke Miyao
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We have created a scheme for annotating corpora designed to capture relevant aspects of factivity in verb-complement constructions. Factivity constructions, a well-known linguistic phenomenon, embed presuppositions about the state of the world into a clause. These embedded presuppositions provide implicit information about facts assumed to be true in the world, and are thus potentially valuable in areas of research such as textual entailment. We attempt both to address clear-cut cases of factivity and non-factivity and to account for the fluid and ambiguous nature of some realizations of this construction. Our extensible scheme is designed to account for distinctions between claims, performatives, atypical uses of factivity, and the authority of the one making the utterance. We introduce a simple XML-based syntax for the annotation of factive verbs and clauses in order to capture this information. We also provide an analysis of the issues that led to these annotative decisions, in the hope that these analyses will be beneficial to those dealing with factivity in a practical context.