Statistical Analysis for Thesaurus Construction using an Encyclopedic Corpus

Yasunori Ohishi, Katunobu Itou, Kazuya Takeda, Atsushi Fujii


Abstract
This paper proposes a discrimination method for hierarchical relations between word pairs. The method is a statistical one using an "encyclopedic corpus" extracted and organized from Web pages. In the proposed method, we use the statistical nature that hyponyms' descriptions tend to include hypernyms whereas hypernyms' descriptions do not include all of the hyponyms. Experimental results show that the method detected 61.7% of the relations in an actual thesaurus.
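The asymmetry described in the abstract can be illustrated with a small sketch. This is not the authors' method, whose statistical model is not given in this record. It is a minimal illustration that assumes each word has a single description and that a plain one-directional mention is enough to guess the direction. The function names `tokens` and `guess_hypernym` and the toy descriptions are hypothetical.

```python
import re


def tokens(text):
    """Lowercase a description and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def guess_hypernym(word_a, word_b, descriptions):
    """Guess which word of the pair is the hypernym, or return None.

    Assumption for illustration only: if word_b appears in word_a's
    description but word_a does not appear in word_b's description,
    word_b is taken to be the hypernym (and symmetrically for word_a).
    """
    a_in_b = word_a in tokens(descriptions[word_b])
    b_in_a = word_b in tokens(descriptions[word_a])
    if b_in_a and not a_in_b:
        return word_b
    if a_in_b and not b_in_a:
        return word_a
    return None  # mentions are mutual or absent, so no decision is made


# Toy descriptions, invented for this example (not from the paper's corpus).
descriptions = {
    "dog": "A dog is a domesticated animal kept as a pet.",
    "animal": "An animal is a living organism that feeds on organic matter.",
}

print(guess_hypernym("dog", "animal", descriptions))
```

A real system would aggregate such evidence over many descriptions per word and apply a statistical decision rule, rather than relying on a single mention.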
Anthology ID:
L06-1042
Volume:
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Month:
May
Year:
2006
Address:
Genoa, Italy
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
URL:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/88_pdf.pdf
Cite (ACL):
Yasunori Ohishi, Katunobu Itou, Kazuya Takeda, and Atsushi Fujii. 2006. Statistical Analysis for Thesaurus Construction using an Encyclopedic Corpus. In Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06), Genoa, Italy. European Language Resources Association (ELRA).
Cite (Informal):
Statistical Analysis for Thesaurus Construction using an Encyclopedic Corpus (Ohishi et al., LREC 2006)
PDF:
http://www.lrec-conf.org/proceedings/lrec2006/pdf/88_pdf.pdf