Abstract
Wikipedia has been used as a knowledge source in many areas of natural language processing. As most studies only use a certain Wikipedia snapshot, the influence of Wikipedia’s massive growth on the results is largely unknown. For the first time, we perform an in-depth analysis of this influence using semantic relatedness as an example application that tests a wide range of Wikipedia’s properties. We find that the growth of Wikipedia has almost no effect on the correlation of semantic relatedness measures with human judgments, while the coverage steadily increases.
- Anthology ID:
- L10-1055
- Volume:
- Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
- Month:
- May
- Year:
- 2010
- Address:
- Valletta, Malta
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- URL:
- http://www.lrec-conf.org/proceedings/lrec2010/pdf/93_Paper.pdf
- Cite (ACL):
- Torsten Zesch and Iryna Gurevych. 2010. The More the Better? Assessing the Influence of Wikipedia’s Growth on Semantic Relatedness Measures. In Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10), Valletta, Malta. European Language Resources Association (ELRA).
- Cite (Informal):
- The More the Better? Assessing the Influence of Wikipedia’s Growth on Semantic Relatedness Measures (Zesch & Gurevych, LREC 2010)