Unsupervised and Domain Independent Ontology Learning: Combining Heterogeneous Sources of Evidence

David Manzano-Macho, Asunción Gómez-Pérez, Daniel Borrajo


Abstract
Acquiring knowledge from the Web to build domain ontologies has become common practice in the Ontological Engineering field. The vast amount of freely available information makes it possible to collect enough information about any domain. However, Web content usually suffers from a lack of structure, untrustworthiness, and ambiguity. These drawbacks hamper the application of the unsupervised ontology-building methods demanded by the increasingly popular applications of the Semantic Web. We believe that combining several processing mechanisms and complementary information sources may solve the problem: analyzing different sources of evidence allows the validity of the detected knowledge to be determined with greater reliability. In this paper, we present GALeOn (General Architecture for Learning Ontologies), which combines sources and processing resources to provide complementary and redundant evidence for making better estimates of the relevance of the extracted knowledge and its relationships. Our goal is to show that combining several information sources and extraction mechanisms makes it possible to build a taxonomy of concepts with higher accuracy than applying only one of them. The experimental results show that this combination notably increases the precision of the obtained results with minimum user intervention.
Anthology ID:
L08-1179
Volume:
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Month:
May
Year:
2008
Address:
Marrakech, Morocco
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2008/pdf/418_paper.pdf
Cite (ACL):
David Manzano-Macho, Asunción Gómez-Pérez, and Daniel Borrajo. 2008. Unsupervised and Domain Independent Ontology Learning: Combining Heterogeneous Sources of Evidence. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08), Marrakech, Morocco. European Language Resources Association (ELRA).
Cite (Informal):
Unsupervised and Domain Independent Ontology Learning: Combining Heterogeneous Sources of Evidence (Manzano-Macho et al., LREC 2008)