Filipe Mesquita


2019

KnowledgeNet: A Benchmark Dataset for Knowledge Base Population
Filipe Mesquita | Matteo Cannaviccio | Jordan Schmidek | Paramita Mirza | Denilson Barbosa
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)

KnowledgeNet is a benchmark dataset for the task of automatically populating a knowledge base (Wikidata) with facts expressed in natural language text on the web. KnowledgeNet provides text exhaustively annotated with facts, thus enabling the end-to-end evaluation of knowledge base population systems, unlike previous benchmarks that are more suitable for the evaluation of individual subcomponents (e.g., entity linking, relation extraction). We discuss five baseline approaches, where the best approach achieves an F1 score of 0.50, significantly outperforming a traditional approach (0.28) by 79%. However, our best baseline is far from reaching human performance (0.82), indicating that our dataset is challenging. The KnowledgeNet dataset and baselines are available at https://github.com/diffbot/knowledge-net

2013

Effectiveness and Efficiency of Open Relation Extraction
Filipe Mesquita | Jordan Schmidek | Denilson Barbosa
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2012

Automatic Evaluation of Relation Extraction Systems on Large-scale
Mirko Bronzi | Zhaochen Guo | Filipe Mesquita | Denilson Barbosa | Paolo Merialdo
Proceedings of the Joint Workshop on Automatic Knowledge Base Construction and Web-scale Knowledge Extraction (AKBC-WEKEX)