Construction of Diachronic Ontologies from People’s Daily of Fifty Years

Shaoda He, Xiaojun Zou, Liumingjing Xiao, Junfeng Hu


Abstract
This paper presents an Ontology Learning From Text (OLFT) method that follows the well-known OLFT layer cake framework. Based on distributional similarity, the proposed method generates multi-level ontologies from comparatively small corpora with the aid of the HITS algorithm. Currently, the method covers term extraction, synonym recognition, concept discovery, and hierarchical concept clustering. Of these, both concept discovery and hierarchical concept clustering are aided by the HITS authority score, which is obtained from the HITS algorithm through iterative recommendation. With this method, a set of diachronic ontologies is constructed, one for each year, from fifty years of People’s Daily corpora (1947 to 1996). Preliminary experiments show that our algorithm outperforms Google’s RNN and K-means based algorithm in both concept discovery and hierarchical concept clustering.
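The HITS authority score mentioned in the abstract can be illustrated with a minimal sketch of the standard HITS power iteration. This is not the authors' implementation: the function name `hits`, the edge-list input, and the toy graph are assumptions made for illustration, and the abstract does not say how the paper derives its graph from distributional similarity.

```python
import math


def hits(edges, iterations=50):
    """Standard HITS power iteration on a directed graph.

    `edges` is an iterable of (source, target) pairs; a hypothetical
    input format chosen for this sketch. Returns (hub, authority) dicts.
    """
    edges = list(edges)
    nodes = sorted({node for edge in edges for node in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    def normalize(scores):
        # Scale to unit Euclidean length; guard against an all-zero vector.
        norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
        return {n: v / norm for n, v in scores.items()}

    for _ in range(iterations):
        # Authority of a node: sum of hub scores of nodes pointing to it.
        new_auth = {n: 0.0 for n in nodes}
        for source, target in edges:
            new_auth[target] += hub[source]
        auth = normalize(new_auth)

        # Hub score of a node: sum of authority scores of nodes it points to.
        new_hub = {n: 0.0 for n in nodes}
        for source, target in edges:
            new_hub[source] += auth[target]
        hub = normalize(new_hub)

    return hub, auth


# Toy graph (hypothetical): "a" and "b" both recommend "c"; only "a" recommends "d".
toy_edges = [("a", "c"), ("b", "c"), ("a", "d")]
hub_scores, authority_scores = hits(toy_edges)
```

In this toy graph, "c" receives recommendations from two nodes and "d" from one, so "c" ends up with the higher authority score. Nodes with no incoming edges ("a" and "b") have an authority of zero.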
Anthology ID:
L14-1297
Volume:
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Month:
May
Year:
2014
Address:
Reykjavik, Iceland
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
3258–3263
URL:
http://www.lrec-conf.org/proceedings/lrec2014/pdf/337_Paper.pdf
Cite (ACL):
Shaoda He, Xiaojun Zou, Liumingjing Xiao, and Junfeng Hu. 2014. Construction of Diachronic Ontologies from People’s Daily of Fifty Years. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 3258–3263, Reykjavik, Iceland. European Language Resources Association (ELRA).
Cite (Informal):
Construction of Diachronic Ontologies from People’s Daily of Fifty Years (He et al., LREC 2014)