Abstract
Wikipedia pages typically contain inter-language links to the corresponding pages in other languages. These links, however, are often incomplete. This paper describes a set of experiments investigating the viability of discovering such missing inter-language links for ambiguous nouns by means of a cross-lingual Word Sense Disambiguation approach. The input for the inter-language link detection system is a set of Dutch pages for a given ambiguous noun, and the output of the system is a set of links to the corresponding pages in three target languages (viz. French, Spanish and Italian). The experimental results show that, although it is a very challenging task, the system succeeds in detecting missing inter-language links between Wikipedia documents for a manually labeled test set. The final goal of the system is to provide a human editor with a list of possible missing links that should be manually verified.
- Anthology ID: L12-1278
- Volume: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
- Month: May
- Year: 2012
- Address: Istanbul, Turkey
- Editors: Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue: LREC
- Publisher: European Language Resources Association (ELRA)
- Pages: 841–846
- URL: http://www.lrec-conf.org/proceedings/lrec2012/pdf/508_Paper.pdf
- Cite (ACL): Els Lefever, Véronique Hoste, and Martine De Cock. 2012. Discovering Missing Wikipedia Inter-language Links by means of Cross-lingual Word Sense Disambiguation. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 841–846, Istanbul, Turkey. European Language Resources Association (ELRA).
- Cite (Informal): Discovering Missing Wikipedia Inter-language Links by means of Cross-lingual Word Sense Disambiguation (Lefever et al., LREC 2012)