Constructing Large Proposition Databases

Peter Exner, Pierre Nugues


Abstract
With the advent of massive online encyclopedic corpora such as Wikipedia, it has become possible to apply a systematic analysis to a wide range of documents covering a significant part of human knowledge. Using semantic parsers, this knowledge can be extracted in the form of propositions (predicate-argument structures) to build large proposition databases from these documents. This paper describes the creation of multilingual proposition databases using generic semantic dependency parsing. Using Wikipedia, we extracted, processed, clustered, and evaluated a large number of propositions. We built an architecture to provide a complete pipeline dealing with the input of text, extraction of knowledge, storage, and presentation of the resulting propositions.
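The abstract describes a pipeline that turns parser output into a queryable proposition store. The sketch below is only an illustration of that idea, not the authors' implementation: the Proposition class, the SQLite schema, and the hard-coded example analysis are all hypothetical, standing in for the output a semantic dependency parser would produce.

```python
# Minimal sketch of storing predicate-argument structures ("propositions")
# in a relational database. Schema and example analysis are hypothetical.
import sqlite3
from dataclasses import dataclass


@dataclass
class Proposition:
    predicate: str                      # e.g. a PropBank-style predicate sense
    arguments: list[tuple[str, str]]    # (role, text) pairs such as ("A0", "Selma Lagerlöf")
    sentence: str                       # source sentence kept for provenance


def store(conn: sqlite3.Connection, props: list[Proposition]) -> None:
    # One row per proposition, one row per argument, linked by prop_id.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS propositions ("
        "id INTEGER PRIMARY KEY, predicate TEXT, sentence TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS arguments ("
        "prop_id INTEGER, role TEXT, text TEXT, "
        "FOREIGN KEY(prop_id) REFERENCES propositions(id))"
    )
    for p in props:
        cur = conn.execute(
            "INSERT INTO propositions (predicate, sentence) VALUES (?, ?)",
            (p.predicate, p.sentence),
        )
        conn.executemany(
            "INSERT INTO arguments (prop_id, role, text) VALUES (?, ?, ?)",
            [(cur.lastrowid, role, text) for role, text in p.arguments],
        )
    conn.commit()


if __name__ == "__main__":
    # Hypothetical parser output for a single Wikipedia sentence.
    example = Proposition(
        predicate="write.01",
        arguments=[("A0", "Selma Lagerlöf"),
                   ("A1", "The Wonderful Adventures of Nils")],
        sentence="Selma Lagerlöf wrote The Wonderful Adventures of Nils.",
    )
    with sqlite3.connect(":memory:") as conn:
        store(conn, [example])
        rows = conn.execute(
            "SELECT p.predicate, a.role, a.text "
            "FROM propositions p JOIN arguments a ON a.prop_id = p.id"
        ).fetchall()
        print(rows)
```

Querying the arguments table by role or text is then enough to retrieve all propositions mentioning a given entity, which is the kind of presentation step the abstract's pipeline ends with.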
Anthology ID:
L12-1238
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3836–3840
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/452_Paper.pdf
Cite (ACL):
Peter Exner and Pierre Nugues. 2012. Constructing Large Proposition Databases. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 3836–3840, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Constructing Large Proposition Databases (Exner & Nugues, LREC 2012)