Modeling Wikipedia Articles to Enhance Encyclopedic Search

Atsushi Fujii


Abstract
Reflecting the rapid growth of science, technology, and culture, it has become common practice to consult tools on the World Wide Web for various terms. Existing search engines provide an enormous volume of information, but retrieved information is not organized. Hand-compiled encyclopedias provide organized information, but the quantity of information is limited. To integrate the advantages of both tools, we have been proposing methods for encyclopedic search targeting information on the Web and patent information. In this paper, we propose a method to categorize multiple expository texts for a single term based on viewpoints. Because viewpoints required for explanation are different depending on the type of a term, such as animals and diseases, it is difficult to manually produce a large scale system. We use Wikipedia to extract a prototype of a viewpoint structure for each term type. We also use articles in Wikipedia for a machine learning method, which categorizes a given text into an appropriate viewpoint. We evaluate the effectiveness of our method experimentally.
Anthology ID:
L10-1471
Volume:
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Month:
May
Year:
2010
Address:
Valletta, Malta
Venue:
LREC
SIG:
Publisher:
European Language Resources Association (ELRA)
Note:
Pages:
Language:
URL:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/684_Paper.pdf
DOI:
Bibkey:
Cite (ACL):
Atsushi Fujii. 2010. Modeling Wikipedia Articles to Enhance Encyclopedic Search. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
Cite (Informal):
Modeling Wikipedia Articles to Enhance Encyclopedic Search (Fujii, LREC 2010)
Copy Citation:
PDF:
http://www.lrec-conf.org/proceedings/lrec2010/pdf/684_Paper.pdf