Gaoying Cui


Chinese Core Ontology Construction from a Bilingual Term Bank
Yirong Chen | Qin Lu | Wenjie Li | Gaoying Cui
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

A core ontology is a mid-level ontology that bridges the gap between an upper ontology and a domain ontology. Automatic Chinese core ontology construction can help model domain knowledge quickly. A graph-based core ontology construction algorithm (COCA) is proposed to automatically construct a core ontology from an English-Chinese bilingual term bank. The algorithm computes the mapping strength from a selected Chinese term to a WordNet synset associated with an upper-level SUMO concept. The strength is measured using a graph model that integrates several mapping features from multiple information sources. The features include a multiple-translation feature between Chinese core terms and WordNet, an extended string feature, and a part-of-speech feature. Evaluation of COCA, repeated on an English-Chinese bilingual term bank with more than 130K entries, shows that the algorithm performs better than in our previous research and can better serve the semi-automatic construction of mid-level ontologies.

Corpus Exploitation from Wikipedia for Ontology Construction
Gaoying Cui | Qin Lu | Wenjie Li | Yirong Chen
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

Ontology construction usually requires a domain-specific corpus for building the corresponding concept hierarchy. The domain corpus must have good coverage of domain knowledge. Wikipedia (Wiki), the world’s largest online encyclopaedic knowledge source, is open-content, collaboratively edited, and free of charge. It contains millions of articles and continues to expand. These characteristics make Wiki a good candidate as a domain corpus resource for ontology construction. However, the selected article collection must be sufficient in both quality and quantity. In this paper, a novel approach is proposed to identify Wiki articles as a domain-specific corpus by using the classification information available in Wiki pages. The main idea is to generate a domain hierarchy from the hyperlinked pages of Wiki. Only articles strongly linked to this hierarchy are selected as the domain corpus. The proposed approach uses the linked category information in Wiki pages to produce the hierarchy as a directed graph, from which a set of pages in the same connected branch is obtained. These pages are then ranked and filtered based on the classification tree generated by the traversal algorithm. The experiment and evaluation results show that Wiki is a good resource for acquiring a relatively high-quality domain-specific corpus for ontology construction.

Preliminary Chinese Term Classification for Ontology Construction
Gaoying Cui | Qin Lu | Wenjie Li
Proceedings of the 6th Workshop on Asian Language Resources


Domain Knowledge Engineering Based on Encyclopedias and the Web Text
Zhifang Sui | Gaoying Cui | Wansong Ding | Qinlong Zhang
Proceedings of the Fifth Workshop on Asian Language Resources (ALR-05) and First Symposium on Asian Language Resources Network (ALRN)