Shaurya Rawat
2020
Hypernym-LIBre: A Free Web-based Corpus for Hypernym Detection
Shaurya Rawat | Mariano Rico | Oscar Corcho
Proceedings of the 12th Web as Corpus Workshop
In this paper, we describe a new web-based corpus for hypernym detection. It consists of 32 GB of high-quality English paragraphs along with their part-of-speech-tagged and dependency-parsed versions. For hypernym detection, the current state of the art uses a corpus that is not freely available. We evaluate the state-of-the-art methods on our corpus and achieve similar results. The advantage of our corpus is that it is available under an open license. Our main contribution is the corpus, with its POS tags and dependency tags, and the code to extract and reproduce the results we obtained with it.