Bob Carpenter


2018

Comparing Bayesian Models of Annotation
Silviu Paun | Bob Carpenter | Jon Chamberlain | Dirk Hovy | Udo Kruschwitz | Massimo Poesio
Transactions of the Association for Computational Linguistics, Volume 6

The analysis of crowdsourced annotations in natural language processing is concerned with identifying (1) gold standard labels, (2) annotator accuracies and biases, and (3) item difficulties and error patterns. Traditionally, majority voting was used for (1), and coefficients of agreement for (2) and (3). Lately, model-based analyses of corpus annotations have proven better at all three tasks. But there has been relatively little work comparing them on the same datasets. This paper aims to fill this gap by analyzing six models of annotation, covering different approaches to annotator ability, item difficulty, and parameter pooling (tying) across annotators and items. We evaluate these models along four aspects: comparison to gold labels, predictive accuracy for new annotations, annotator characterization, and item difficulty, using four datasets with varying degrees of noise in the form of random (spammy) annotators. We conclude with guidelines for model selection, application, and implementation.
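
As a concrete illustration of the kind of pooled annotation model the paper compares, here is a minimal sketch of a Dawid-Skene-style model fit with EM in NumPy. It estimates per-annotator confusion matrices and a posterior over each item's true label. The function and variable names are invented for this sketch and do not come from the paper or its reference implementations; the paper's models add priors, hierarchical pooling, and item difficulty on top of this basic structure.

```python
import numpy as np

def dawid_skene(item_ids, annotator_ids, labels, n_classes, n_iter=50):
    """EM for per-annotator confusion matrices and per-item class posteriors.

    item_ids, annotator_ids, labels are parallel integer arrays with one
    entry per individual annotation.
    """
    n_items = item_ids.max() + 1
    n_annotators = annotator_ids.max() + 1

    # Initialize the class posterior z from softened majority voting.
    z = np.zeros((n_items, n_classes))
    np.add.at(z, (item_ids, labels), 1.0)
    z = (z + 0.1) / (z + 0.1).sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # M-step: class prevalence pi and annotator confusion matrices theta.
        pi = z.mean(axis=0)
        theta = np.full((n_annotators, n_classes, n_classes), 1e-6)
        for i, j, y in zip(item_ids, annotator_ids, labels):
            theta[j, :, y] += z[i]
        theta /= theta.sum(axis=2, keepdims=True)

        # E-step: posterior over each item's true class given all its labels.
        log_z = np.tile(np.log(pi), (n_items, 1))
        for i, j, y in zip(item_ids, annotator_ids, labels):
            log_z[i] += np.log(theta[j, :, y])
        log_z -= log_z.max(axis=1, keepdims=True)
        z = np.exp(log_z)
        z /= z.sum(axis=1, keepdims=True)

    return z, theta, pi

# Toy example: three items, three annotators, binary labels; the third
# annotator disagrees with the other two on two of the three items.
items  = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
coders = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2])
labels = np.array([1, 1, 0, 0, 0, 1, 1, 1, 1])
posterior, confusion, prevalence = dawid_skene(items, coders, labels, n_classes=2)
```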

2014

The Benefits of a Model of Annotation
Rebecca J. Passonneau | Bob Carpenter
Transactions of the Association for Computational Linguistics, Volume 2

Standard agreement measures for interannotator reliability are neither necessary nor sufficient to ensure a high quality corpus. In a case study of word sense annotation, conventional methods for evaluating labels from trained annotators are contrasted with a probabilistic annotation model applied to crowdsourced data. The annotation model provides far more information, including a certainty measure for each gold standard label; the crowdsourced data was collected at less than half the cost of the conventional approach.

2013

The Benefits of a Model of Annotation
Rebecca J. Passonneau | Bob Carpenter
Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse

2008

Software Engineering, Testing, and Quality Assurance for Natural Language Processing
K. Bretonnel Cohen | Bob Carpenter
Software Engineering, Testing, and Quality Assurance for Natural Language Processing

2007

Proceedings of Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT)
Bob Carpenter | Amanda Stent | Jason D. Williams
Proceedings of Human Language Technologies: The Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT)

2006

Character Language Models for Chinese Word Segmentation and Named Entity Recognition
Bob Carpenter
Proceedings of the Fifth SIGHAN Workshop on Chinese Language Processing

2005

Scaling High-Order Character Language Models to Gigabytes
Bob Carpenter
Proceedings of Workshop on Software

Switch Graphs for Parsing Type Logical Grammars
Bob Carpenter | Glyn Morrill
Proceedings of the Ninth International Workshop on Parsing Technology

2004

Head-Driven Parsing for Word Lattices
Christopher Collins | Bob Carpenter | Gerald Penn
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04)

2003

Alias-i Threat Trackers
Breck Baldwin | Bob Carpenter | Aaron Ross
Companion Volume of the Proceedings of HLT-NAACL 2003 - Demonstrations

1999

Vector-based Natural Language Call Routing
Jennifer Chu-Carroll | Bob Carpenter
Computational Linguistics, Volume 25, Number 3, September 1999

1998

Dialogue Management in Vector-Based Call Routing
Jennifer Chu-Carroll | Bob Carpenter
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1

Dialogue Management in Vector-Based Call Routing
Jennifer Chu-Carroll | Bob Carpenter
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics

1997

Probabilistic Parsing using Left Corner Language Models
Christopher D. Manning | Bob Carpenter
Proceedings of the Fifth International Workshop on Parsing Technologies

We introduce a novel parser based on a probabilistic version of a left-corner parser. The left-corner strategy is attractive because rule probabilities can be conditioned on both top-down goals and bottom-up derivations. We develop the underlying theory and explain how a grammar can be induced from analyzed data. We show that the left-corner approach provides an advantage over simple top-down probabilistic context-free grammars in parsing the Wall Street Journal using a grammar induced from the Penn Treebank. We also conclude that the Penn Treebank provides a fairly weak test bed due to the flatness of its bracketings and to the obvious overgeneration and undergeneration of its induced grammar.
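
The key point of the abstract, that each left-corner move can be conditioned on both the top-down goal and the bottom-up left corner, can be illustrated with a small scoring function. The move inventory and table names below are a simplification invented for this sketch, not the parametrization used in the paper.

```python
import math

def score_left_corner_derivation(moves, shift_p, project_p, attach_p):
    """Log-probability of a left-corner derivation.

    moves is a sequence of (kind, goal, corner, outcome) tuples:
      'shift'   -- predict the next word given the current goal category,
      'project' -- choose a rule whose first daughter is the left corner,
                   conditioned on the (goal, corner) pair,
      'attach'  -- decide whether the completed constituent satisfies the
                   goal directly, also conditioned on (goal, corner).
    """
    logp = 0.0
    for kind, goal, corner, outcome in moves:
        if kind == "shift":
            logp += math.log(shift_p[goal][outcome])
        elif kind == "project":
            logp += math.log(project_p[(goal, corner)][outcome])
        else:  # attach
            logp += math.log(attach_p[(goal, corner)][outcome])
    return logp
```

Conditioning the project and attach moves on the (goal, corner) pair is what distinguishes this model from a plain top-down PCFG, where a rule's probability depends only on its parent category.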

1995

An Abstract Machine for Attribute-Value Logics
Bob Carpenter | Yan Qu
Proceedings of the Fourth International Workshop on Parsing Technologies

A direct abstract machine implementation of the core attribute-value logic operations is shown to decrease the number of operations and conserve the amount of storage required when compared to interpreters or indirect compilers. We describe the fundamental data structures and compilation techniques that we have employed to develop a unification and constraint-resolution engine capable of performance rivaling that of directly compiled Prolog terms while greatly exceeding Prolog in flexibility, expressiveness, and modularity. We discuss the core architecture of our machine, beginning with a survey of the data structures supporting the small set of attribute-value logic instructions. These instructions manipulate feature structures by means of features, equality, and typing, and manipulate the program state by search and sequencing operations. We further show how these core operations can be integrated with a broad range of standard parsing techniques. Feature structures improve upon Prolog terms by allowing data to be organized by feature rather than by position. This encourages modular program development through the use of sparse structural descriptions which can be logically conjoined into larger units and directly executed. Standard linguistic representations, even of relatively simple local syntactic and semantic structures, typically run to hundreds of substructures. The type discipline we impose organizes information in an object-oriented manner by the multiple inheritance of classes and their associated features and type value constraints. In practice, this allows the construction of large-scale grammars in a relatively short period of time. At run-time, eager copying and structure-sharing are replaced with lazy, incremental, and localized branch and write operations. In order to allow for applications with parallel search, incremental backtracking can be localized to disjunctive choice points within the description of a single structure, thus supporting the kind of conditional mutual consistency checks used in modern grammatical theories such as HPSG, GB, and LFG. Further attention is paid to the byte-coding of instructions and their efficient indexing and subsequent retrieval, all of which is keyed on type information.
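
The core operations the abstract refers to, dereferencing, type unification, and feature-wise unification with structure sharing, can be sketched in a few lines. This is an illustrative Python rendering rather than the byte-coded abstract machine the paper describes; the class, function, and type names are invented here, and a real engine would also record changes on a trail so that failure can be undone, which is what the abstract's incremental backtracking refers to.

```python
class FS:
    """Typed feature structure node with a forward pointer for sharing."""
    def __init__(self, typ, feats=None):
        self.typ = typ
        self.feats = feats or {}
        self.forward = None          # set when this node is unified away

    def deref(self):
        node = self
        while node.forward is not None:
            node = node.forward
        return node

def unify(a, b, join):
    """Destructively unify two feature structures.

    join(t1, t2) returns the least upper bound of two types, or None
    if the types are incompatible.  Returns the unified node or None.
    """
    a, b = a.deref(), b.deref()
    if a is b:
        return a
    t = join(a.typ, b.typ)
    if t is None:
        return None
    a.typ = t
    b.forward = a                    # structure sharing: b now points at a
    for f, val in b.feats.items():
        if f in a.feats:
            if unify(a.feats[f], val, join) is None:
                return None
        else:
            a.feats[f] = val
    return a

# Toy type join: 'top' is below every other type; distinct types clash.
def toy_join(t1, t2):
    if t1 == t2:
        return t1
    if t1 == "top":
        return t2
    if t2 == "top":
        return t1
    return None

a = FS("top", {"AGR": FS("sg")})
b = FS("sign", {"AGR": FS("top"), "CAT": FS("np")})
result = unify(a, b, toy_join)   # a 'sign' with AGR 'sg' and CAT 'np'
```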

Computational phonology: A constraint-based approach
Deirdre Wheeler | Bob Carpenter
Computational Linguistics, Volume 21, Number 4, December 1995

1994

Constraint-based Morpho-phonology
Michael Mastroianni | Bob Carpenter
Computational Phonology

1993

Compiling Typed Attribute-Value Logic Grammars
Bob Carpenter
Proceedings of the Third International Workshop on Parsing Technologies

The unification-based approach to processing attribute-value logic grammars, similar to Prolog interpretation, has become the standard. We propose an alternative, embodied in the Attribute-Logic Engine (ALE) (Carpenter 1993), based on the Warren Abstract Machine (WAM) approach to compiling Prolog (Aït-Kaci 1991). Phrase structure grammars with procedural attachments, similar to Definite Clause Grammars (DCG) (Pereira and Warren 1980), are specified using a typed version of Rounds-Kasper logic (Carpenter 1992). We argue for the benefits of a strong and total version of typing in terms of both clarity and efficiency. Finally, we discuss the compilation of grammars into a few efficient low-level instructions for the basic feature structure operations.
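
To make the idea of compiling descriptions into a small instruction set concrete, here is a schematic sketch. The instruction names and the description encoding are invented for illustration and are not ALE's actual compilation scheme.

```python
def compile_description(desc, path=()):
    """Compile a nested attribute-value description into abstract instructions.

    desc is a dict such as {"type": "sign", "feats": {"CAT": {"type": "np"}}}.
    Each emitted triple (opcode, argument, path) would, when executed against
    a current feature structure, enforce one piece of the description:
      SET_TYPE -- unify the node at `path` with the given type,
      GET_FEAT -- follow (or create) a feature arc from the node at `path`.
    """
    code = [("SET_TYPE", desc["type"], path)]
    for feat, sub in desc.get("feats", {}).items():
        code.append(("GET_FEAT", feat, path))
        code.extend(compile_description(sub, path + (feat,)))
    return code

# Example: a sign whose CAT value is of type np.
program = compile_description({"type": "sign", "feats": {"CAT": {"type": "np"}}})
# [('SET_TYPE', 'sign', ()), ('GET_FEAT', 'CAT', ()), ('SET_TYPE', 'np', ('CAT',))]
```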

1991

The Generative Power of Categorial Grammars and Head-Driven Phrase Structure Grammars with Lexical Rules
Bob Carpenter
Computational Linguistics, Volume 17, Number 3, September 1991

Inclusion, Disjointness and Choice: The Logic of Linguistic Classification
Bob Carpenter | Carl Pollard
29th Annual Meeting of the Association for Computational Linguistics