2010
GikiCLEF: Crosscultural Issues in Multilingual Information Access
Diana Santos | Luís Miguel Cabral | Corina Forascu | Pamela Forner | Fredric Gey | Katrin Lamm | Thomas Mandl | Petya Osenova | Anselmo Peñas | Álvaro Rodrigo | Julia Schulz | Yvonne Skalban | Erik Tjong Kim Sang
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
In this paper we describe GikiCLEF, the first evaluation contest that, to our knowledge, was specifically designed to expose and investigate cultural and linguistic issues involved in structured multimedia collections and searching, and which was organized under the scope of CLEF 2009. GikiCLEF evaluated systems that answered questions that are hard for both humans and machines, in ten different Wikipedia collections, namely Bulgarian, Dutch, English, German, Italian, Norwegian (Bokmål and Nynorsk), Portuguese, Romanian, and Spanish. After a short historical introduction, we present the task, together with its motivation, and discuss how the topics were chosen. Then we provide another description from the point of view of the participants. Before disclosing their results, we introduce the SIGA management system, explaining the several tasks which were carried out behind the scenes. We quantify in turn the GIRA resource, offered to the community for training and further evaluating systems with the help of the 50 topics gathered and the solutions identified. We end the paper with a critical discussion of what was learned, advancing possible ways to reuse the data.
2008
A Japanese-English Technical Lexicon for Translation and Language Research
Fredric Gey | David Kirk Evans | Noriko Kando
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
In this paper we present a Japanese-English bilingual lexicon of technical terms. The lexicon was derived from the first and second NTCIR evaluation collections for research into cross-language information retrieval for Asian languages. While it can be utilized for translation between Japanese and English, the lexicon is also suitable for language research and language engineering. Since it is collection-derived, it contains instances of word variants and misspellings which make it eminently suitable for further research. For a subset of the lexicon we make available the collection statistics. In addition, we make available a Katakana subset suitable for transliteration research.
An Evaluation Resource for Geographic Information Retrieval
Thomas Mandl | Fredric Gey | Giorgio Di Nunzio | Nicola Ferro | Mark Sanderson | Diana Santos | Christa Womser-Hacker
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
In this paper we present an evaluation resource for geographic information retrieval developed within the Cross Language Evaluation Forum (CLEF). The GeoCLEF track is dedicated to the evaluation of geographic information retrieval systems. The resource encompasses more than 600,000 documents, 75 topics so far, and more than 100,000 relevance judgments for these topics. Geographic information retrieval requires an evaluation resource which represents realistic information needs and which is geographically challenging. Some experimental results and analysis are reported.
2001
Entry Vocabulary - a Technology to Enhance Digital Search
Fredric Gey | Michael Buckland | Aitao Chen | Ray Larson
Proceedings of the First International Conference on Human Language Technology Research