Akimichi Tanaka


2010

Study of Word Sense Disambiguation System that uses Contextual Features - Approach of Combining Associative Concept Dictionary and Corpus -
Kyota Tsutsumida | Jun Okamoto | Shun Ishizaki | Makoto Nakatsuji | Akimichi Tanaka | Tadasu Uchiyama
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

We propose a Word Sense Disambiguation (WSD) method that accurately maps ambiguous words to concepts in the Associative Concept Dictionary (ACD), even when the test corpus and the training corpus come from different domains. Many WSD studies determine the context of the target ambiguous word by analyzing sentences that contain it. However, such methods perform poorly when applied to a corpus that differs from the training corpus. One solution is to use the associated words that the ACD assigns to each concept independently of domain, that is, words that many people commonly associate with a given concept. Furthermore, by using the associated words of a concept as search queries against a training corpus, our method extracts relevant words: words that are computationally judged to be related to that concept. By counting how often associated words and relevant words appear near the target word in a sentence of the test corpus, our method assigns the target word to a concept in the ACD. Our evaluation on two different types of corpora shows that the method achieves good accuracy.
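The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the co-occurrence threshold `min_count`, the window size, the whitespace tokenization, and the toy data are all assumptions made for illustration. The abstract does not specify how relevance is computed or how counts are combined, so simple co-occurrence counting and an unweighted sum stand in for those steps here.

```python
from collections import Counter


def extract_relevant_words(associated, training_sentences, min_count=2,
                           stopwords=frozenset()):
    """Use associated words as queries: collect other words that co-occur
    with them in training sentences at least `min_count` times.
    (Hypothetical stand-in for the paper's relevance judgment.)"""
    counts = Counter()
    for sentence in training_sentences:
        tokens = sentence.lower().split()
        # A sentence "matches the query" if it contains any associated word.
        if any(tok in associated for tok in tokens):
            counts.update(
                tok for tok in tokens
                if tok not in associated and tok not in stopwords
            )
    return {word for word, n in counts.items() if n >= min_count}


def score_concepts(tokens, target_index, concept_cues, window=5):
    """Count cue words (associated + relevant) within `window` tokens
    on either side of the target word, for each candidate concept."""
    lo = max(0, target_index - window)
    hi = min(len(tokens), target_index + window + 1)
    context = tokens[lo:target_index] + tokens[target_index + 1:hi]
    return {
        concept: sum(1 for tok in context if tok in cues)
        for concept, cues in concept_cues.items()
    }


def classify(tokens, target_index, concept_cues, window=5):
    """Pick the concept with the highest cue count.
    Ties go to the concept listed first in `concept_cues`."""
    scores = score_concepts(tokens, target_index, concept_cues, window)
    return max(scores, key=scores.get)


# Toy data (invented for illustration, not taken from the ACD).
associated = {
    "finance": {"money", "loan"},
    "river": {"river", "water"},
}
training = [
    "a loan carries interest",
    "money earns interest",
    "the river has a strong current",
    "water follows its current",
]

# Cue set per concept = associated words plus extracted relevant words.
concept_cues = {
    concept: words | extract_relevant_words(words, training)
    for concept, words in associated.items()
}

test_tokens = "she paid interest to the bank on her loan".split()
predicted = classify(test_tokens, test_tokens.index("bank"), concept_cues)
```

In this sketch the associated words play the domain-independent role, while the relevant words adapt the cue set to whatever training corpus is available. A fuller version would likely weight the two kinds of cues differently and use a proper tokenizer and relevance measure.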