Shaoda He


2014

Construction of Diachronic Ontologies from People’s Daily of Fifty Years
Shaoda He | Xiaojun Zou | Liumingjing Xiao | Junfeng Hu
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper presents an Ontology Learning From Text (OLFT) method that follows the well-known OLFT layer cake framework. Based on distributional similarity, the proposed method generates multi-level ontologies from comparatively small corpora with the aid of the HITS algorithm. Currently, the method covers term extraction, synonym recognition, concept discovery, and hierarchical concept clustering. Of these, concept discovery and hierarchical concept clustering are both aided by the HITS authority score, which is obtained from the HITS algorithm through iterative recommendation. With this method, a diachronic ontology is constructed for each year of the People’s Daily corpora spanning fifty years (1947 to 1996). Preliminary experiments show that our algorithm outperforms Google’s RNN and K-means based algorithm in both concept discovery and hierarchical concept clustering.
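The HITS authority mentioned in the abstract can be illustrated with a minimal sketch of the standard HITS iteration. This is not the authors' implementation: the abstract does not say how the graph is built, so the toy edge list below (terms "recommending" other terms) and the function name `hits` are assumptions made only for illustration.

```python
import math

def hits(edges, iterations=50):
    """Standard HITS on a directed graph given as (source, target) pairs.

    Returns two dicts: hub scores and authority scores, each L2-normalized.
    """
    edges = list(edges)
    nodes = sorted({node for edge in edges for node in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of all nodes pointing at a node.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {n: v / norm for n, v in new_auth.items()}
        # Hub update: sum the authority scores of all nodes a node points at.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}
    return hub, auth

# Hypothetical toy graph: each edge means "source recommends target".
edges = [("apple", "fruit"), ("pear", "fruit"), ("banana", "fruit"), ("apple", "pear")]
hub, auth = hits(edges)
top_authority = max(auth, key=auth.get)
print(top_authority, round(auth[top_authority], 3))
```

In this toy graph the node that receives the most recommendations ends up with the highest authority score. How the paper derives edges from distributional similarity, and how the authority feeds into concept discovery and clustering, is described in the paper itself rather than in this sketch.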