Gareth J. F. Jones

Also published as: Gareth J.F. Jones


Porting a Summarizer to the French Language
Rémi Bois | Johannes Leveling | Lorraine Goeuriot | Gareth J. F. Jones | Liadh Kelly
Proceedings of TALN 2014 (Volume 2: Short Papers)


Creating a Data Collection for Evaluating Rich Speech Retrieval
Maria Eskevich | Gareth J.F. Jones | Martha Larson | Roeland Ordelman
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We describe the development of a test collection for the investigation of speech retrieval beyond identification of relevant content. This collection focuses on satisfying user information needs for queries associated with specific types of speech acts. The collection is based on an archive of Internet video from an Internet video sharing platform, and was provided by the MediaEval benchmarking initiative. A crowdsourcing approach was used to identify segments in the video data which contain speech acts, to create a description of the video containing the act, and to generate search queries designed to re-find this speech act. We describe and reflect on our experiences with crowdsourcing this test collection using the Amazon Mechanical Turk platform. We highlight the challenges of constructing this dataset, including the selection of the data source, the design of the crowdsourcing task, and the specification of queries and relevant items.


Building a Domain-specific Document Collection for Evaluating Metadata Effects on Information Retrieval
Walid Magdy | Jinming Min | Johannes Leveling | Gareth J. F. Jones
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper describes the development of a structured document collection containing user-generated text and numerical metadata for exploring the exploitation of metadata in information retrieval (IR). The collection consists of more than 61,000 documents extracted from YouTube video pages on basketball in general and the NBA (National Basketball Association) in particular, together with a set of 40 topics and their relevance judgements. In addition, a collection of nearly 250,000 user profiles related to the NBA collection is available. Several baseline IR experiments report the effect of using video-associated metadata on retrieval effectiveness. The results surprisingly show that searching the video titles only performs significantly better than searching additional metadata text fields of the videos, such as the tags or the description.


Multilingual Search for Cultural Heritage Archives via Combining Multiple Translation Resources
Gareth J. F. Jones | Ying Zhang | Eamonn Newman | Fabio Fantino | Franca Debole
Proceedings of the Workshop on Language Technology for Cultural Heritage Data (LaTeCH 2007)