Corpus+WordNet thesaurus generation for ontology enriching
Fernando Castilho, Roger Granada, Breno Meneghetti, Leonardo Carvalho, Renata Vieira
Abstract
This paper presents a model for enriching an ontology with a thesaurus built from a domain corpus and WordNet. The model is applied to the data privacy domain, where the initial resources comprise a data privacy ontology and a corpus of privacy laws, regulations and guidelines for projects. From these resources, a thesaurus is generated automatically. The ontology concepts serve as the thesaurus seeds. For each seed, similar terms are extracted from the corpus using known thesaurus generation methods. A filtering process then searches WordNet for semantic relations between the seeds and their similar terms. These semantic relations are used to expand the ontology with relations between its concepts and related terms in the corpus. The resulting resource is a hierarchical structure that can support ontology investigation and maintenance. The results allow the domain knowledge to be investigated with the support of semantic relations not present in the original ontology.
- Anthology ID:
- L12-1633
- Volume:
- Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
- Month:
- May
- Year:
- 2012
- Address:
- Istanbul, Turkey
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 3463–3467
- URL:
- http://www.lrec-conf.org/proceedings/lrec2012/pdf/1062_Paper.pdf
- Cite (ACL):
- Fernando Castilho, Roger Granada, Breno Meneghetti, Leonardo Carvalho, and Renata Vieira. 2012. Corpus+WordNet thesaurus generation for ontology enriching. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 3463–3467, Istanbul, Turkey. European Language Resources Association (ELRA).
- Cite (Informal):
- Corpus+WordNet thesaurus generation for ontology enriching (Castilho et al., LREC 2012)
- PDF:
- http://www.lrec-conf.org/proceedings/lrec2012/pdf/1062_Paper.pdf
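
The pipeline described in the abstract (ontology concepts as seeds, similar terms drawn from the corpus, then a WordNet-based filter) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the window-based cosine similarity is only one possible thesaurus generation method, the small `TOY_RELATIONS` table is a hypothetical stand-in for WordNet lookups, and every function name and data value below is an assumption made for illustration.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy corpus standing in for the privacy-law corpus.
TOY_CORPUS = [
    "the controller stores personal data securely",
    "the controller stores personal information securely",
    "the user gives consent freely",
    "the user gives permission freely",
]

# Hypothetical stand-in for WordNet: (seed, term) -> relation label.
TOY_RELATIONS = {
    ("data", "information"): "synonym",
    ("data", "record"): "hyponym",
    ("consent", "permission"): "synonym",
}


def context_vectors(sentences, window=2):
    """Map each token to a Counter of the tokens seen within `window` positions."""
    vectors = {}
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            context = vectors.setdefault(token, Counter())
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    context[tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similar_terms(seed, vectors, top_n=5):
    """Return up to `top_n` corpus terms whose contexts overlap with the seed's."""
    if seed not in vectors:
        return []
    scored = [(cosine(vectors[seed], vec), term)
              for term, vec in vectors.items() if term != seed]
    scored = [(score, term) for score, term in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:top_n]]


def enrich(seeds, sentences, relations, top_n=5):
    """Keep only the similar terms that have a known relation to their seed."""
    vectors = context_vectors(sentences)
    enriched = {}
    for seed in seeds:
        kept = []
        for candidate in similar_terms(seed, vectors, top_n):
            relation = relations.get((seed, candidate))
            if relation is not None:
                kept.append((candidate, relation))
        enriched[seed] = kept
    return enriched


result = enrich(["data", "consent"], TOY_CORPUS, TOY_RELATIONS)
```

In this toy setting, candidates such as "personal" or "stores" share contexts with the seed "data" but are discarded by the filter because the relation table has no entry for them, which mirrors the role the abstract assigns to WordNet.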