Kenneth Church

Also published as: Ken Church, Kenneth W. Church, Kenneth Ward Church


2023

A Research-Based Guide for the Creation and Deployment of a Low-Resource Machine Translation System
John E. Ortega | Kenneth Church
Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing

The machine translation (MT) field seems to focus heavily on English and other high-resource languages. However, low-resource MT (LRMT) is receiving more attention than in the past. Successful LRMT systems (LRMTS) should make a compelling business case in terms of demand, cost and quality in order to be viable for end users. When used by communities where low-resource languages are spoken, LRMT quality should not only be determined by the use of traditional metrics like BLEU, but it should also take into account other factors in order to be inclusive and not risk overall rejection by the community. MT systems based on neural methods tend to perform better with high volumes of training data, but they may be unrealistic and even harmful for LRMT. It is obvious that for research purposes, the development and creation of LRMTS is necessary. However, in this article, we argue that two main workarounds could be considered by companies that are considering deployment of LRMTS in the wild: human-in-the-loop and sub-domains.

2022

A Gentle Introduction to Deep Nets and Opportunities for the Future
Kenneth Church | Valia Kordoni | Gary Marcus | Ernest Davis | Yanjun Ma | Zeyu Chen
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Tutorial Abstracts

The first half of this tutorial will make deep nets more accessible to a broader audience, following “Deep Nets for Poets” and “A Gentle Introduction to Fine-Tuning.” We will also introduce GFT (general fine tuning), a little language for fine tuning deep nets with short (one line) programs that are as easy to code as regression in statistics packages such as R using glm (general linear models). Based on the success of these methods on a number of benchmarks, one might come away with the impression that deep nets are all we need. However, we believe the glass is half-full: while there is much that can be done with deep nets, there is always more to do. The second half of this tutorial will discuss some of these opportunities.

Training on Lexical Resources
Kenneth Church | Xingyu Cai | Yuchen Bian
Proceedings of the Thirteenth Language Resources and Evaluation Conference

We propose using lexical resources (thesaurus, VAD) to fine-tune pretrained deep nets such as BERT and ERNIE. Then at inference time, these nets can be used to distinguish synonyms from antonyms, as well as VAD distances. The inference method can be applied to words as well as texts such as multiword expressions (MWEs), out of vocabulary words (OOVs), morphological variants and more. Code and data are posted on https://github.com/kwchurch/syn_ant.

Data Augmentation for the Post-Stroke Speech Transcription (PSST) Challenge: Sometimes Less Is More
Jiahong Yuan | Xingyu Cai | Kenneth Church
Proceedings of the RaPID Workshop - Resources and ProcessIng of linguistic, para-linguistic and extra-linguistic Data from people with various forms of cognitive/psychiatric/developmental impairments - within the 13th Language Resources and Evaluation Conference

We employ the method of fine-tuning wav2vec2.0 for recognition of phonemes in aphasic speech. Our effort focuses on data augmentation, by supplementing data from both in-domain and out-of-domain datasets for training. We found that although a modest amount of out-of-domain data may be helpful, the performance of the model degrades significantly when the amount of out-of-domain data is much larger than in-domain data. Our hypothesis is that fine-tuning wav2vec2.0 with a CTC loss not only learns bottom-up acoustic properties but also top-down constraints. Therefore, out-of-domain data augmentation is likely to degrade performance if there is a language model mismatch between “in” and “out” domains. For in-domain audio without ground truth labels, we found that it is beneficial to exclude samples with less confident pseudo labels. Our final model achieves 16.7% PER (phoneme error rate) on the validation set, without using a language model for decoding. The result represents a relative error reduction of 14% over the baseline model trained without data augmentation. Finally, we found that “canonicalized” phonemes are much easier to recognize than manually transcribed phonemes.

ArtELingo: A Million Emotion Annotations of WikiArt with Emphasis on Diversity over Language and Culture
Youssef Mohamed | Mohamed Abdelfattah | Shyma Alhuwaider | Feifan Li | Xiangliang Zhang | Kenneth Church | Mohamed Elhoseiny
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

This paper introduces ArtELingo, a new benchmark and dataset, designed to encourage work on diversity across languages and cultures. Following ArtEmis, a collection of 80k artworks from WikiArt with 0.45M emotion labels and English-only captions, ArtELingo adds another 0.79M annotations in Arabic and Chinese, plus 4.8K in Spanish to evaluate “cultural-transfer” performance. 51K artworks have 5 annotations or more in 3 languages. This diversity makes it possible to study similarities and differences across languages and cultures. Further, we investigate captioning tasks, and find diversity improves the performance of baseline models. ArtELingo is publicly available at www.artelingo.org with standard splits and baseline models. We hope our work will help ease future research on multilinguality and culturally-aware AI.

2021

On Attention Redundancy: A Comprehensive Study
Yuchen Bian | Jiaji Huang | Xingyu Cai | Jiahong Yuan | Kenneth Church
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

The multi-layer multi-head self-attention mechanism is widely applied in modern neural language models. Attention redundancy has been observed among attention heads but has not been deeply studied in the literature. Using the BERT-base model as an example, this paper provides a comprehensive study on attention redundancy which is helpful for model interpretation and model compression. We analyze the attention redundancy with Five-Ws and How. (What) We define and focus the study on redundancy matrices generated from pre-trained and fine-tuned BERT-base models for GLUE datasets. (How) We use both token-based and sentence-based distance functions to measure the redundancy. (Where) Clear and similar redundancy patterns (cluster structure) are observed among attention heads. (When) Redundancy patterns are similar in both pre-training and fine-tuning phases. (Who) We discover that redundancy patterns are task-agnostic. Similar redundancy patterns even exist for randomly generated token sequences. (“Why”) We also evaluate influences of the pre-training dropout ratios on attention redundancy. Based on the phase-independent and task-agnostic attention redundancy patterns, we propose a simple zero-shot pruning method as a case study. Experiments on fine-tuning GLUE tasks verify its effectiveness. The comprehensive analyses on attention redundancy make model understanding and zero-shot model pruning promising.

Data Collection vs. Knowledge Graph Completion: What is Needed to Improve Coverage?
Kenneth Church | Yuchen Bian
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

This survey/position paper discusses ways to improve coverage of resources such as WordNet. Rapp estimated correlations, rho, between corpus statistics and psycholinguistic norms. rho improves with quantity (corpus size) and quality (balance). 1M words is enough for simple estimates (unigram frequencies), but at least 100x more is required for good estimates of word associations and embeddings. Given such estimates, WordNet’s coverage is remarkable. WordNet was developed on SemCor, a small sample (200k words) from the Brown Corpus. Knowledge Graph Completion (KGC) attempts to learn missing links from subsets. But Rapp’s estimates of sizes suggest it would be more profitable to collect more data than to infer missing information that is not there.

Proceedings of the 1st Workshop on Benchmarking: Past, Present and Future
Kenneth Church | Mark Liberman | Valia Kordoni
Proceedings of the 1st Workshop on Benchmarking: Past, Present and Future

Benchmarking: Past, Present and Future
Kenneth Church | Mark Liberman | Valia Kordoni
Proceedings of the 1st Workshop on Benchmarking: Past, Present and Future

Where have we been, and where are we going? It is easier to talk about the past than the future. These days, benchmarks evolve more bottom up (such as papers with code). There used to be more top-down leadership from government (and industry, in the case of systems, with benchmarks such as SPEC). Going forward, there may be more top-down leadership from organizations like MLPerf and/or influencers like David Ferrucci, who was responsible for IBM’s success with Jeopardy, and has recently written a paper suggesting how the community should think about benchmarking for machine comprehension. Tasks such as reading comprehension become even more interesting as we move beyond English. Multilinguality introduces many challenges, and even more opportunities.

2020

Incremental Text-to-Speech Synthesis with Prefix-to-Prefix Framework
Mingbo Ma | Baigong Zheng | Kaibo Liu | Renjie Zheng | Hairong Liu | Kainan Peng | Kenneth Church | Liang Huang
Findings of the Association for Computational Linguistics: EMNLP 2020

Text-to-speech synthesis (TTS) has witnessed rapid progress in recent years, with neural methods becoming capable of producing audio with high naturalness. However, these efforts still suffer from two types of latencies: (a) the computational latency (synthesizing time), which grows linearly with the sentence length, and (b) the input latency in scenarios where the input text is incrementally available (such as in simultaneous translation, dialog generation, and assistive technologies). To reduce these latencies, we propose a neural incremental TTS approach using the prefix-to-prefix framework from simultaneous translation. We synthesize speech in an online fashion, playing a segment of audio while generating the next, resulting in an O(1) rather than O(n) latency. Experiments on English and Chinese TTS show that our approach achieves speech naturalness similar to that of full-sentence TTS, but with only a constant (1-2 words) latency.

Fluent and Low-latency Simultaneous Speech-to-Speech Translation with Self-adaptive Training
Renjie Zheng | Mingbo Ma | Baigong Zheng | Kaibo Liu | Jiahong Yuan | Kenneth Church | Liang Huang
Findings of the Association for Computational Linguistics: EMNLP 2020

Simultaneous speech-to-speech translation is an extremely challenging but widely useful scenario that aims to generate target-language speech only a few seconds behind the source-language speech. In addition, we have to continuously translate a speech of multiple sentences, but all recent solutions merely focus on the single-sentence scenario. As a result, current approaches will accumulate more and more latencies in later sentences when the speaker talks faster and introduce unnatural pauses into translated speech when the speaker talks slower. To overcome these issues, we propose Self-Adaptive Translation which flexibly adjusts the length of translations to accommodate different source speech rates. At similar levels of translation quality (as measured by BLEU), our method generates more fluent target speech with lower latency than the baseline, in both Zh<->En directions.

Improving Bilingual Lexicon Induction for Low Frequency Words
Jiaji Huang | Xingyu Cai | Kenneth Church
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

This paper designs a Monolingual Lexicon Induction task and observes that two factors accompany the degraded accuracy of bilingual lexicon induction for rare words: first, a diminishing margin between similarities in the low-frequency regime, and second, exacerbated hubness at low frequency. Based on these observations, we further propose two methods to address these two factors, respectively. The larger issue is hubness. Addressing that improves induction accuracy significantly, especially for low-frequency words.

2019

Hubless Nearest Neighbor Search for Bilingual Lexicon Induction
Jiaji Huang | Qiang Qiu | Kenneth Church
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Bilingual Lexicon Induction (BLI) is the task of translating words from corpora in two languages. Recent advances in BLI work by aligning the two word embedding spaces. Following that, a key step is to retrieve the nearest neighbor (NN) in the target space given the source word. However, a phenomenon called hubness often degrades the accuracy of NN. Hubness appears as some data points, called hubs, being extraordinarily close to many of the other data points. Reducing hubness is necessary for retrieval tasks. One successful example is Inverted SoFtmax (ISF), recently proposed to improve NN. This work proposes a new method, Hubless Nearest Neighbor (HNN), to mitigate hubness. HNN differs from NN by imposing an additional equal preference assumption. Moreover, the HNN formulation explains why ISF works as well as it does. Empirical results demonstrate that HNN outperforms NN, ISF and other state-of-the-art methods. For reproducibility and follow-ups, we have published all code.

2016

C2D2E2: Using Call Centers to Motivate the Use of Dialog and Diarization in Entity Extraction
Ken Church | Weizhong Zhu | Jason Pelecanos
Proceedings of the Workshop on Uphill Battles in Language Processing: Scaling Early Achievements to Robust Methods

2014

The Case for Empiricism (With and Without Statistics)
Kenneth Church
Proceedings of Frame Semantics in NLP: A Workshop in Honor of Chuck Fillmore (1929-2014)

2011

How Many Multiword Expressions do People Know?
Kenneth Church
Proceedings of the Workshop on Multiword Expressions: from Parsing and Generation to the Real World

Using Large Monolingual and Bilingual Corpora to Improve Coordination Disambiguation
Shane Bergsma | David Yarowsky | Kenneth Church
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

A Fast Re-scoring Strategy to Capture Long-Distance Dependencies
Anoop Deoras | Tomáš Mikolov | Kenneth Church
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

Proceedings of the IJCNLP 2011 System Demonstrations
Kenneth Church | Yunqing Xia
Proceedings of the IJCNLP 2011 System Demonstrations

2010

New Tools for Web-Scale N-grams
Dekang Lin | Kenneth Church | Heng Ji | Satoshi Sekine | David Yarowsky | Shane Bergsma | Kailash Patil | Emily Pitler | Rachel Lathbury | Vikram Rao | Kapil Dalwani | Sushant Narsale
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

While the web provides a fantastic linguistic resource, collecting and processing data at web-scale is beyond the reach of most academic laboratories. Previous research has relied on search engines to collect online information, but this is hopelessly inefficient for building large-scale linguistic resources, such as lists of named-entity types or clusters of distributionally similar words. An alternative to processing web-scale text directly is to use the information provided in an N-gram corpus. An N-gram corpus is an efficient compression of large amounts of text. An N-gram corpus states how often each sequence of words (up to length N) occurs. We propose tools for working with enhanced web-scale N-gram corpora that include richer levels of source annotation, such as part-of-speech tags. We describe a new set of search tools that make use of these tags, and collectively lower the barrier for lexical learning and ambiguity resolution at web-scale. They will allow novel sources of information to be applied to long-standing natural language challenges.

NLP on Spoken Documents Without ASR
Mark Dredze | Aren Jansen | Glen Coppersmith | Ken Church
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

Using Web-scale N-grams to Improve Base NP Parsing Performance
Emily Pitler | Shane Bergsma | Dekang Lin | Kenneth Church
Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010)

2009

Using Word-Sense Disambiguation Methods to Classify Web Queries by Intent
Emily Pitler | Ken Church
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing

Repetition and Language Models and Comparable Corpora
Ken Church
Proceedings of the 2nd Workshop on Building and Using Comparable Corpora: from Parallel to Non-parallel Corpora (BUCC)

2007

K-Best Suffix Arrays
Kenneth Church | Bo Thiesson | Robert Ragno
Human Language Technologies 2007: The Conference of the North American Chapter of the Association for Computational Linguistics; Companion Volume, Short Papers

Compressing Trigram Language Models With Golomb Coding
Kenneth Church | Ted Hart | Jianfeng Gao
Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL)

A Sketch Algorithm for Estimating Two-Way and Multi-Way Associations
Ping Li | Kenneth W. Church
Computational Linguistics, Volume 33, Number 3, September 2007

2005

The Wild Thing
Ken Church | Bo Thiesson
Proceedings of the ACL Interactive Poster and Demonstration Sessions

Last Words: Reviewing the Reviewers
Kenneth Church
Computational Linguistics, Volume 31, Number 4, December 2005

Using Sketches to Estimate Associations
Ping Li | Kenneth W. Church
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing

2002

NLP Found Helpful (at least for one Text Categorization Task)
Carl Sable | Kathleen McKeown | Kenneth Church
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP 2002)

2001

Using Bins to Empirically Estimate Term Weights for Text Categorization
Carl Sable | Kenneth W. Church
Proceedings of the 2001 Conference on Empirical Methods in Natural Language Processing

Using Suffix Arrays to Compute Term Frequency and Document Frequency for All Substrings in a Corpus
Mikio Yamamoto | Kenneth W. Church
Computational Linguistics, Volume 27, Number 1, March 2001

2000

Empirical Estimates of Adaptation: The chance of Two Noriegas is closer to p/2 than p²
Kenneth W. Church
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

Empirical Term Weighting and Expansion Frequency
Kyoji Umemura | Kenneth W. Church
2000 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora

1999

What’s Happened Since the First SIGDAT Meeting?
Kenneth Ward Church
1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora

1998

Using Suffix Arrays to Compute Term Frequency and Document Frequency for All Substrings in a Corpus
Mikio Yamamoto | Kenneth W. Church
Sixth Workshop on Very Large Corpora

1996

Panel: The limits of automation: optimists vs skeptics.
Eduard Hovy | Ken Church | Denis Gachot | Marge Leon | Alan Melby | Sergei Nirenburg | Yorick Wilks
Conference of the Association for Machine Translation in the Americas

1995

Inverse Document Frequency (IDF): A Measure of Deviations from Poisson
Kenneth Church | William Gale
Third Workshop on Very Large Corpora

1994

Is MT Research Doing Any Good?
Kenneth Church | Bonnie Dorr | Eduard Hovy | Sergei Nirenburg | Bernard Scott | Virginia Teller
Proceedings of the First Conference of the Association for Machine Translation in the Americas

Fax: An Alternative to SGML
Kenneth W. Church | William A. Gale | Jonathan I. Helfman | David D. Lewis
COLING 1994 Volume 1: The 15th International Conference on Computational Linguistics

K-vec: A New Approach for Aligning Parallel Texts
Pascale Fung | Kenneth Ward Church
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics

Termight: Identifying and Translating Technical Terminology
Ido Dagan | Ken Church
Fourth Conference on Applied Natural Language Processing

1993

Char_align: A Program for Aligning Parallel Texts at the Character Level
Kenneth Ward Church
31st Annual Meeting of the Association for Computational Linguistics

Robust Bilingual Word Alignment for Machine Aided Translation
Ido Dagan | Kenneth Church | William Gale
Very Large Corpora: Academic and Industrial Perspectives

Introduction to the Special Issue on Computational Linguistics Using Large Corpora
Kenneth W. Church | Robert L. Mercer
Computational Linguistics, Volume 19, Number 1, March 1993, Special Issue on Using Large Corpora: I

A Program for Aligning Sentences in Bilingual Corpora
William A. Gale | Kenneth W. Church
Computational Linguistics, Volume 19, Number 1, March 1993, Special Issue on Using Large Corpora: I

1992

Using bilingual materials to develop word sense disambiguation methods
William A. Gale | Kenneth W. Church | David Yarowsky
Proceedings of the Fourth Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages

Estimating Upper and Lower Bounds on the Performance of Word-Sense Disambiguation Programs
William Gale | Kenneth Ward Church | David Yarowsky
30th Annual Meeting of the Association for Computational Linguistics

One Sense Per Discourse
William A. Gale | Kenneth W. Church | David Yarowsky
Speech and Natural Language: Proceedings of a Workshop Held at Harriman, New York, February 23-26, 1992

1991

Identifying Word Correspondences in Parallel Texts
William A. Gale | Kenneth W. Church
Speech and Natural Language: Proceedings of a Workshop Held at Pacific Grove, California, February 19-22, 1991

Book Reviews: Theory and Practice in Corpus Linguistics
Kenneth Ward Church
Computational Linguistics, Volume 17, Number 1, March 1991

A Program for Aligning Sentences in Bilingual Corpora
William A. Gale | Kenneth W. Church
29th Annual Meeting of the Association for Computational Linguistics

1990

A Spelling Correction Program Based on a Noisy Channel Model
Mark D. Kernighan | Kenneth W. Church | William A. Gale
COLING 1990 Volume 2: Papers presented to the 13th International Conference on Computational Linguistics

Poor Estimates of Context are Worse than None
William A. Gale | Kenneth W. Church
Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, Pennsylvania, June 24-27, 1990

Word Association Norms, Mutual Information, and Lexicography
Kenneth Ward Church | Patrick Hanks
Computational Linguistics, Volume 16, Number 1, March 1990

1989

Parsing, Word Associations and Typical Predicate-Argument Relations
Kenneth Church | William Gale | Patrick Hanks | Donald Hindle
Speech and Natural Language: Proceedings of a Workshop Held at Cape Cod, Massachusetts, October 15-18, 1989

Enhanced Good-Turing and Cat-Cal: Two New Methods for Estimating Probabilities of English Bigrams (abbreviated version)
Kenneth W. Church | William A. Gale
Speech and Natural Language: Proceedings of a Workshop Held at Cape Cod, Massachusetts, October 15-18, 1989

Session 11 Natural Language III
Kenneth Ward Church
Speech and Natural Language: Proceedings of a Workshop Held at Cape Cod, Massachusetts, October 15-18, 1989

Word Association Norms, Mutual Information, and Lexicography
Kenneth Ward Church | Patrick Hanks
27th Annual Meeting of the Association for Computational Linguistics

Parsing, Word Associations and Typical Predicate-Argument Relations
Kenneth Church | William Gale | Patrick Hanks | Donald Hindle
Proceedings of the First International Workshop on Parsing Technologies

There are a number of collocational constraints in natural languages that ought to play a more important role in natural language parsers. Thus, for example, it is hard for most parsers to take advantage of the fact that wine is typically drunk, produced, and sold, but (probably) not pruned. So too, it is hard for a parser to know which verbs go with which prepositions (e.g., set up) and which nouns fit together to form compound noun phrases (e.g., computer programmer). This paper will attempt to show that many of these types of concerns can be addressed with syntactic methods (symbol pushing), and need not require explicit semantic interpretation. We have found that it is possible to identify many of these interesting co-occurrence relations by computing simple summary statistics over millions of words of text. This paper will summarize a number of experiments carried out by various subsets of the authors over the last few years. The term collocation will be used quite broadly to include constraints on SVO (subject verb object) triples, phrasal verbs, compound noun phrases, and psycholinguistic notions of word association (e.g., doctor/nurse).

1988

Complexity, Two-Level Morphology and Finnish
Kimmo Koskenniemi | Kenneth Ward Church
Coling Budapest 1988 Volume 1: International Conference on Computational Linguistics

A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text
Kenneth Ward Church
Second Conference on Applied Natural Language Processing

1986

Morphological Decomposition and Stress Assignment for Speech Synthesis
Kenneth Church
24th Annual Meeting of the Association for Computational Linguistics

1985

Stress Assignment in Letter to Sound Rules for Speech Synthesis
Kenneth Church
23rd Annual Meeting of the Association for Computational Linguistics

1983

A Finite-State Parser for Use in Speech Recognition
Kenneth W. Church
21st Annual Meeting of the Association for Computational Linguistics

1982

Coping with Syntactic Ambiguity or How to Put the Block in the Box on the Table
Kenneth Church | Ramesh Patil
American Journal of Computational Linguistics, Volume 8, Number 3-4, July-December 1982

1980

On Parsing Strategies and Closure
Kenneth Church
18th Annual Meeting of the Association for Computational Linguistics