A Framework for Compiling High Quality Knowledge Resources From Raw Corpora

Gongye Jin; Daisuke Kawahara; Sadao Kurohashi

Abstract

Identifying various types of relations is a necessary step toward enabling computers to understand natural language text. In particular, clarifying the relations between predicates and their arguments is essential, because predicate-argument structures convey most of the information in natural languages. To capture these relations precisely, wide-coverage knowledge resources are indispensable. Such resources can be derived from automatic parses of raw corpora, but parsing has not yet achieved performance high enough for precise knowledge acquisition. We present a framework for compiling high-quality knowledge resources from raw corpora. The framework selects high-quality dependency relations from automatic parses and uses them not only to calculate fundamental distributional similarity but also to acquire knowledge such as case frames.

Anthology ID: L14-1638
Volume: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month: May
Year: 2014
Address: Reykjavik, Iceland
Editors: Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue: LREC
Publisher: European Language Resources Association (ELRA)
Pages: 109–114
URL: http://www.lrec-conf.org/proceedings/lrec2014/pdf/828_Paper.pdf
Cite (ACL): Gongye Jin, Daisuke Kawahara, and Sadao Kurohashi. 2014. A Framework for Compiling High Quality Knowledge Resources From Raw Corpora. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 109–114, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal): A Framework for Compiling High Quality Knowledge Resources From Raw Corpora (Jin et al., LREC 2014)
PDF: http://www.lrec-conf.org/proceedings/lrec2014/pdf/828_Paper.pdf
