Augmenting a Semantic Verb Lexicon with a Large Scale Collection of Example Sentences

Kentaro Inui; Toru Hirano; Ryu Iida; Atsushi Fujita; Yuji Matsumoto

Abstract

One of the crucial issues in semantic parsing is how to reduce the cost of collecting a sufficiently large amount of labeled data. This paper presents a new approach to cost-saving annotation of example sentences with predicate-argument structure information, taking Japanese as the target language. In this scheme, a large collection of unlabeled examples is first clustered and selectively sampled, and for each sampled cluster, only one representative example is labeled by a human annotator. The advantages of this approach are empirically supported by the results of our preliminary experiments, in which we use an existing similarity function and a naive sampling strategy.
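The cluster, sample, and label loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the token-overlap (Jaccard) similarity, the greedy threshold clustering, the "largest clusters first" sampling rule, the choice of the first member as the representative, and the propagation of the representative's label to the rest of its cluster. The names `jaccard`, `cluster`, and `annotate` are hypothetical.

```python
from typing import Callable, Dict, List


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity (a stand-in, not the paper's similarity function)."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def cluster(examples: List[str],
            sim: Callable[[str, str], float],
            threshold: float) -> List[List[str]]:
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters: List[List[str]] = []
    for ex in examples:
        for members in clusters:
            if sim(ex, members[0]) >= threshold:
                members.append(ex)
                break
        else:
            # No existing cluster is close enough, so ex seeds a new one.
            clusters.append([ex])
    return clusters


def annotate(examples: List[str],
             annotator: Callable[[str], str],
             sim: Callable[[str, str], float] = jaccard,
             threshold: float = 0.5,
             budget: int = 2) -> Dict[str, str]:
    """Ask the annotator for one label per sampled cluster.

    Assumption: the representative's label is copied to every member
    of its cluster; unsampled clusters stay unlabeled.
    """
    clusters = cluster(examples, sim, threshold)
    # Naive sampling (assumed): spend the budget on the largest clusters.
    sampled = sorted(clusters, key=len, reverse=True)[:budget]
    labels: Dict[str, str] = {}
    for members in sampled:
        label = annotator(members[0])  # the seed acts as the representative
        for ex in members:
            labels[ex] = label
    return labels
```

Under these assumptions the annotator is consulted at most `budget` times regardless of how many examples are in the collection, which is the source of the cost saving; how much label quality is lost depends on how well the similarity function groups examples that share a predicate-argument structure.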

Anthology ID: L06-1370
Volume: Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month: May
Year: 2006
Address: Genoa, Italy
Editors: Nicoletta Calzolari, Khalid Choukri, Aldo Gangemi, Bente Maegaard, Joseph Mariani, Jan Odijk, Daniel Tapias
Venue: LREC
Publisher: European Language Resources Association (ELRA)
URL: http://www.lrec-conf.org/proceedings/lrec2006/pdf/610_pdf.pdf
Cite (ACL): Kentaro Inui, Toru Hirano, Ryu Iida, Atsushi Fujita, and Yuji Matsumoto. 2006. Augmenting a Semantic Verb Lexicon with a Large Scale Collection of Example Sentences. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal): Augmenting a Semantic Verb Lexicon with a Large Scale Collection of Example Sentences (Inui et al., LREC 2006)
