Bob Carpenter

The analysis of crowdsourced annotations in natural language processing is concerned with identifying (1) gold standard labels, (2) annotator accuracies and biases, and (3) item difficulties and error patterns. Traditionally, majority voting was used for 1, and coefficients of agreement for 2 and 3. Lately, model-based analysis of corpus annotations have proven better at all three tasks. But there has been relatively little work comparing them on the same datasets. This paper aims to fill this gap by analyzing six models of annotation, covering different approaches to annotator ability, item difficulty, and parameter pooling (tying) across annotators and items. We evaluate these models along four aspects: comparison to gold labels, predictive accuracy for new annotations, annotator characterization, and item difficulty, using four datasets with varying degrees of noise in the form of random (spammy) annotators. We conclude with guidelines for model selection, application, and implementation.

2014

Standard agreement measures for interannotator reliability are neither necessary nor sufficient to ensure a high quality corpus. In a case study of word sense annotation, conventional methods for evaluating labels from trained annotators are contrasted with a probabilistic annotation model applied to crowdsourced data. The annotation model provides far more information, including a certainty measure for each gold standard label; the crowdsourced data was collected at less than half the cost of the conventional approach.

2013

2008

2007

2006

2005

2004

2003

1999

1998

1997

We introduce a novel parser based on a probabilistic version of a left-corner parser. The left-corner strategy is attractive because rule probabilities can be conditioned on both top-down goals and bottom-up derivations. We develop the underlying theory and explain how a grammar can be induced from analyzed data. We show that the left-corner approach provides an advantage over simple top-down probabilistic context-free grammars in parsing the Wall Street Journal using a grammar induced from the Penn Treebank. We also conclude that the Penn Treebank provides a fairly weak tes bed due to the flatness of its bracketings and to the obvious overgeneration and undergeneration of its induced grammar.

1995

A direct abstract machine implementation of the core attribute-value logic operations is shown to decrease the number of operations and conserve the amount of storage required when compared to interpreters or indirect compilers. In this paper, we describe the fundamental data structures and compilation techniques that we have employed to develop a unification and constraint-resolution engine capable of performance rivaling that of directly compiled Prolog terms while greatly exceeding Prolog in flexibility, expressiveness and modularity. In this paper, we will discuss the core architecture of our machine. We begin with a survey of the data structures supporting the small set of attribute-value logic instructions. These instructions manipulate feature structures by means of features, equality, and typing, and manipulate the program state by search and sequencing operations. We further show how these core operations can be integrated with a broad range of standard parsing techniques. Feature structures improve upon Prolog terms by allowing data to be organized by feature rather than by position. This encourages modular program development through the use of sparse structural descriptions which can be logically conjoined into larger units and directly executed. Standard linguistic representations, even of relatively simple local syntactic and semantic structures, typically run to hundreds of substructures. The type discipline we impose organizes information in an object-oriented manner by the multiple inheritance of classes and their associated features and type value constraints. In practice, this allows the construction of large-scale grammars in a relatively short period of time. At run-time, eager copying and structure-sharing is replaced with lazy, incremental, and localized branch and write operations. In order to allow for applications with parallel search, incremental backtracking can be localized to disjunctive choice points within the description of a single structure, thus supporting the kind of conditional mutual consistency checks used in modern grammatical theories such as HPSG, GB, and LFG. Further attention is paid to the byte-coding of instructions and their efficient indexing and subsequent retrieval, all of which is keyed on type information.

1994

1993

The unification-based approach to processing attribute-value logic grammars, similar to Prolog interpretation, has become the standard. We propose an alternative, embodied in the Attribute-Logic Engine (ALE) (Carpenter 1993) , based on the Warren Abstract Machine (WAM) approach to compiling Prolog (Aït-Kaci 1991). Phrase structure grammars with procedural attachments, similar to Definite Clause Grammars (DCG) (Pereira — Warren 1980), are specified using a typed version of Rounds-Kasper logic (Carpenter 1992). We argue for the benefits of a strong and total version of typing in terms of both clarity and efficiency. Finally, we discuss the compilation of grammars into a few efficient low-level instructions for the basic feature structure operations.

Fixing paper assignments

2018