2016
Multilingual Supervision of Semantic Annotation
Peter Exner | Marcus Klang | Pierre Nugues
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
In this paper, we investigate the annotation projection of semantic units in a practical setting. Previous approaches have focused on using parallel corpora for semantic transfer. We evaluate an alternative approach using loosely parallel corpora that does not require the corpora to be exact translations of each other. We developed a method that transfers semantic annotations from one language to another using sentences aligned by entities, and we extended it to include alignments by entity-like linguistic units. We conducted our experiments on a large scale using the English, Swedish, and French language editions of Wikipedia. Our results show that the annotation projection using entities in combination with loosely parallel corpora provides a viable approach to extending previous attempts. In addition, it allows the generation of proposition banks upon which semantic parsers can be trained.
2015
A Distant Supervision Approach to Semantic Role Labeling
Peter Exner | Marcus Klang | Pierre Nugues
Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics
2014
REFRACTIVE: An Open Source Tool to Extract Knowledge from Syntactic and Semantic Relations
Peter Exner | Pierre Nugues
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
The extraction of semantic propositions has proven instrumental in applications like IBM Watson and in Google’s knowledge graph. One of the core components of IBM Watson is the PRISMATIC knowledge base consisting of one billion propositions extracted from the English version of Wikipedia and the New York Times. However, extracting the propositions from the English version of Wikipedia is a time-consuming process. In practice, this task requires multiple machines and a computation distribution involving a good deal of system technicalities. In this paper, we describe Refractive, an open-source tool to extract propositions from a parsed corpus based on the Hadoop variant of MapReduce. While the complete process consists of a parsing part and an extraction part, we focus here on the extraction from the parsed corpus and we hope this tool will help computational linguists speed up the development of applications.
2012
Using Syntactic Dependencies to Solve Coreferences
Marcus Stamborg | Dennis Medved | Peter Exner | Pierre Nugues
Joint Conference on EMNLP and CoNLL - Shared Task
Constructing Large Proposition Databases
Peter Exner | Pierre Nugues
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
With the advent of massive online encyclopedic corpora such as Wikipedia, it has become possible to apply a systematic analysis to a wide range of documents covering a significant part of human knowledge. Using semantic parsers, it has become possible to extract such knowledge in the form of propositions (predicate-argument structures) and build large proposition databases from these documents. This paper describes the creation of multilingual proposition databases using generic semantic dependency parsing. Using Wikipedia, we extracted, processed, clustered, and evaluated a large number of propositions. We built an architecture to provide a complete pipeline dealing with the input of text, extraction of knowledge, storage, and presentation of the resulting propositions.