Towards Data-driven Ontologies: a Filtering Approach using Keywords and Natural Language Constructs

Maaike de Boer, Jack P. C. Verhoosel


Abstract
Creating ontologies is an expensive task. Our vision is that we can automatically generate ontologies from a set of relevant documents to kick-start ontology-creation sessions. In this paper, we focus on enhancing two often-used methods, OpenIE and co-occurrences (Cooc). We evaluate the methods on two document sets, one about pizza and one about the agriculture domain. The methods are evaluated using two types of F1-score (objective, quantitative) and through a human assessment (subjective, qualitative). The results show that 1) Cooc performs both objectively and subjectively better than OpenIE; 2) the filtering methods based on keywords and on Word2vec perform similarly; 3) both filtering methods perform better than OpenIE and similarly to Cooc; 4) Cooc-NVP performs best, especially in the subjective evaluation. Although the investigated methods provide a good start for extracting an ontology from a set of domain documents, various improvements are still possible, especially in the natural-language-based methods.
Anthology ID:
2020.lrec-1.278
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
2285–2292
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.278
Cite (ACL):
Maaike de Boer and Jack P. C. Verhoosel. 2020. Towards Data-driven Ontologies: a Filtering Approach using Keywords and Natural Language Constructs. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 2285–2292, Marseille, France. European Language Resources Association.
Cite (Informal):
Towards Data-driven Ontologies: a Filtering Approach using Keywords and Natural Language Constructs (de Boer & Verhoosel, LREC 2020)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2020.lrec-1.278.pdf