2016
A Tangled Web: The Faint Signals of Deception in Text - Boulder Lies and Truth Corpus (BLT-C)
Franco Salvetti | John B. Lowe | James H. Martin
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
We present an approach to creating corpora for use in detecting deception in text, including a discussion of the challenges peculiar to this task. Our approach is based on soliciting several types of reviews from writers and was implemented using Amazon Mechanical Turk. We describe the multi-dimensional corpus of reviews built using this approach, available free of charge from LDC as the Boulder Lies and Truth Corpus (BLT-C). Challenges for both corpus creation and deception detection include the fact that human performance on the task is typically at chance, that the signal is faint, that paid writers such as turkers are sometimes deceptive, and that deception is a complex human behavior; manifestations of deception depend on details of domain, intrinsic properties of the deceiver (such as education, linguistic competence, and the nature of the intention), and specifics of the deceptive act (e.g., lying vs. fabricating). To overcome the inherent lack of ground truth, we have developed a set of semi-automatic techniques to ensure corpus validity. We present some preliminary results on the task of deception detection which suggest that the BLT-C is an improvement in the quality of resources available for this task.
2008
Improving Search Results Quality by Customizing Summary Lengths
Michael Kaisser | Marti A. Hearst | John B. Lowe
Proceedings of ACL-08: HLT
Creating a Research Collection of Question Answer Sentence Pairs with Amazon’s Mechanical Turk
Michael Kaisser | John Lowe
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Each year NIST releases a set of (question, document ID, answer) triples for the factoid questions used in the TREC Question Answering track. While this resource is widely used and has proved useful for many purposes, it is too coarse-grained for many others. In this paper we describe how we have used Amazon's Mechanical Turk to have multiple subjects read the documents and identify the sentences that contain the answer. For most of the 1911 questions in the test sets from 2002 to 2006 and each of the documents said to contain an answer, the Question-Answer Sentence Pairs (QASP) corpus introduced in this paper contains the identified answer sentences. We believe that this corpus, which we will make available to the public, can further stimulate research in QA, especially linguistically motivated research, where matching the question to the answer sentence by either syntactic or semantic means is a central concern.
1998
The Berkeley FrameNet Project
Collin F. Baker | Charles J. Fillmore | John B. Lowe
COLING 1998 Volume 1: The 17th International Conference on Computational Linguistics
The Berkeley FrameNet Project
Collin F. Baker | Charles J. Fillmore | John B. Lowe
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 1
1997
A Frame-Semantic Approach to Semantic Annotation
John B. Lowe
Tagging Text with Lexical Semantics: Why, What, and How?
1994
The Reconstruction Engine: A Computer Implementation of the Comparative Method
John B. Lowe | Martine Mazaudon
Computational Linguistics, Volume 20, Number 3, September 1994