Robert Gaizauskas

Also published as: R. Gaizauskas, Rob Gaizauskas, Robert J. Gaizauskas

2024

BLN600: A Parallel Corpus of Machine/Human Transcribed Nineteenth Century Newspaper Texts
Callum William Booth | Alan Thomas | Robert Gaizauskas
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

We present a publicly available corpus of nineteenth-century newspaper text focused on crime in London, derived from the Gale British Library Newspapers corpus parts 1 and 2. The corpus comprises 600 newspaper excerpts and for each excerpt contains the original source image, the machine transcription of that image as found in the BLN and a gold standard manual transcription that we have created. We envisage the corpus will be helpful for the training and development of OCR and post-OCR correction methodologies for historical newspaper machine transcription—for which there is currently a dearth of publicly available resources. In this paper, we discuss the rationale behind gathering such a corpus, the methodology used to select, process, and align the data, and the corpus’ potential utility for historians and digital humanities researchers—particularly within the realms of neural machine translation-based post-OCR correction approaches, and other natural language processing tasks that are critically affected by erroneous OCR.

Leveraging LLMs for Post-OCR Correction of Historical Newspapers
Alan Thomas | Robert Gaizauskas | Haiping Lu
Proceedings of the Third Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) @ LREC-COLING-2024

Poor OCR quality continues to be a major obstacle for humanities scholars seeking to make use of digitised primary sources such as historical newspapers. Typical approaches to post-OCR correction employ sequence-to-sequence models for a neural machine translation task, mapping erroneous OCR texts to accurate reference texts. We shift our focus towards the adaptation of generative LLMs for a prompt-based approach. By instruction-tuning Llama 2 and comparing it to a fine-tuned BART on BLN600, a parallel corpus of 19th century British newspaper articles, we demonstrate the potential of a prompt-based approach in detecting and correcting OCR errors, even with limited training data. We achieve a significant enhancement in OCR quality with Llama 2 outperforming BART, achieving a 54.51% reduction in the character error rate against BART’s 23.30%. This paves the way for future work leveraging generative LLMs to improve the accessibility and unlock the full potential of historical texts for humanities research.

2023

Obituary: Yorick Wilks
John Tait | Robert Gaizauskas | Kalina Bontcheva
Computational Linguistics, Volume 49, Issue 3 - September 2023

2022

Predicting the Presence of Reasoning Markers in Argumentative Text
Jonathan Clayton | Rob Gaizauskas
Proceedings of the 9th Workshop on Argument Mining

This paper proposes a novel task in Argument Mining, which we will refer to as Reasoning Marker Prediction. We reuse the popular Persuasive Essays Corpus (Stab and Gurevych, 2014). Instead of using this corpus for Argument Structure Parsing, we use a simple heuristic method to identify text spans which we can identify as reasoning markers. We propose baseline methods for predicting the presence of these reasoning markers automatically, and make a script to generate the data for the task publicly available.

SNuC: The Sheffield Numbers Spoken Language Corpus
Emma Barker | Jon Barker | Robert Gaizauskas | Ning Ma | Monica Lestari Paramita
Proceedings of the Thirteenth Language Resources and Evaluation Conference

We present SNuC, the first published corpus of spoken alphanumeric identifiers of the sort typically used as serial and part numbers in the manufacturing sector. The dataset contains recordings and transcriptions of over 50 native British English speakers, speaking over 13,000 multi-character alphanumeric sequences and totalling almost 20 hours of recorded speech. We describe requirements taken into account in designing the corpus and the methodology used to construct it. We present summary statistics describing the corpus contents, as well as a preliminary investigation into errors in spoken alphanumeric identifiers. We validate the corpus by showing how it can be used to adapt a deep learning neural network based ASR system, resulting in improved recognition accuracy on the task of spoken alphanumeric identifier recognition. Finally, we discuss further potential uses for the corpus and for the tools developed to construct it.

A Language Modelling Approach to Quality Assessment of OCR’ed Historical Text
Callum Booth | Robert Shoemaker | Robert Gaizauskas
Proceedings of the Thirteenth Language Resources and Evaluation Conference

We hypothesise and evaluate a language model-based approach for scoring the quality of OCR transcriptions in the British Library Newspapers (BLN) corpus parts 1 and 2, to identify the best quality OCR for use in further natural language processing tasks, with a wider view to link individual newspaper reports of crime in nineteenth-century London to the Digital Panopticon—a structured repository of criminal lives. We mitigate the absence of gold standard transcriptions of the BLN corpus by utilising a corpus of genre-adjacent texts that capture the common and legal parlance of nineteenth-century London—the Proceedings of the Old Bailey Online—with a view to rank the BLN transcriptions by their OCR quality.

A Pilot Study on the Collection and Computational Analysis of Linguistic Differences Amongst Men and Women in a Kuwaiti Arabic WhatsApp Dataset
Hesah Aldihan | Robert Gaizauskas | Susan Fitzmaurice
Proceedings of the Seventh Arabic Natural Language Processing Workshop (WANLP)

This study focuses on the collection and computational analysis of Kuwaiti Arabic (KA), which is considered a low resource dialect, to test different sociolinguistic hypotheses related to gendered language use. In this paper, we describe the collection and analysis of a corpus of WhatsApp Group chats with mixed gender Kuwaiti participants. This corpus, which we are making publicly available, is the first corpus of KA conversational data. We analyse different interactional and linguistic features to get insights about features that may be indicative of gender to inform the development of a gender classification system for KA in an upcoming study. Statistical analysis of our data shows that there is insufficient evidence to claim that there are significant differences amongst men and women with respect to number of turns, length of turns and number of emojis. However, qualitative analysis shows that men and women differ substantially in the types of emojis they use and in their use of lengthened words.

2021

Using Listeners’ Interpretations in Topic Classification of Song Lyrics
Varvara Papazoglou | Robert Gaizauskas
Proceedings of the 2nd Workshop on NLP for Music and Spoken Audio (NLP4MusA)

2016

Summarizing Multi-Party Argumentative Conversations in Reader Comment on News
Emma Barker | Robert Gaizauskas
Proceedings of the Third Workshop on Argument Mining (ArgMining2016)

The SENSEI Annotated Corpus: Human Summaries of Reader Comment Conversations in On-line News
Emma Barker | Monica Lestari Paramita | Ahmet Aker | Emina Kurtic | Mark Hepple | Robert Gaizauskas
Proceedings of the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Don’t Mention the Shoe! A Learning to Rank Approach to Content Selection for Image Description Generation
Josiah Wang | Robert Gaizauskas
Proceedings of the 9th International Natural Language Generation conference

A Document Repository for Social Media and Speech Conversations
Adam Funk | Robert Gaizauskas | Benoit Favre
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We present a successfully implemented document repository REST service for flexible SCRUD (search, create, read, update, delete) storage of social media conversations, using a GATE/TIPSTER-like document object model and providing a query language for document features. This software is currently being used in the SENSEI research project and will be published as open-source software before the project ends. It is, to the best of our knowledge, the first freely available, general purpose data repository to support large-scale multimodal (i.e., speech or text) conversation analytics.

Cross-validating Image Description Datasets and Evaluation Metrics
Josiah Wang | Robert Gaizauskas
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

The task of automatically generating sentential descriptions of image content has become increasingly popular in recent years, resulting in the development of large-scale image description datasets and the proposal of various metrics for evaluating image description generation systems. However, not much work has been done to analyse and understand both datasets and the metrics. In this paper, we propose using a leave-one-out cross validation (LOOCV) process as a means to analyse multiply annotated, human-authored image description datasets and the various evaluation metrics, i.e. evaluating one image description against other human-authored descriptions of the same image. Such an evaluation process affords various insights into the image description datasets and evaluation metrics, such as the variations of image descriptions within and across datasets and also what the metrics capture. We compute and analyse (i) human upper-bound performance; (ii) ranked correlation between metric pairs across datasets; (iii) lower-bound performance by comparing a set of descriptions describing one image to another sentence not describing that image. Interesting observations are made about the evaluation metrics and image description datasets, and we conclude that such cross-validation methods are extremely useful for assessing and gaining insights into image description datasets and evaluation metrics for image descriptions.

Automatic summarization of reader comments in on-line news is an extremely challenging task and a capability for which there is a clear need. Work to date has focussed on producing extractive summaries using well-known techniques imported from other areas of language processing. But are extractive summaries of comments what users really want? Do they support users in performing the sorts of tasks they are likely to want to perform with reader comments? In this paper we address these questions by doing three things. First, we offer a specification of one possible summary type for reader comment, based on an analysis of reader comment in terms of issues and viewpoints. Second, we define a task-based evaluation framework for reader comment summarization that allows summarization systems to be assessed in terms of how well they support users in a time-limited task of identifying issues and characterising opinion on issues in comments. Third, we describe a pilot evaluation in which we used the task-based evaluation framework to evaluate a prototype reader comment clustering and summarization system, demonstrating the viability of the evaluation framework and illustrating the sorts of insight such an evaluation affords.

2015

Temporal Relation Classification using a Model of Tense and Aspect
Leon Derczynski | Robert Gaizauskas
Proceedings of the International Conference Recent Advances in Natural Language Processing

Defining Visually Descriptive Language
Robert Gaizauskas | Josiah Wang | Arnau Ramisa
Proceedings of the Fourth Workshop on Vision and Language

Comment-to-Article Linking in the Online News Domain
Ahmet Aker | Emina Kurtic | Mark Hepple | Rob Gaizauskas | Giuseppe Di Fabbrizio
Proceedings of the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue

Generating Image Descriptions with Gold Standard Visual Inputs: Motivation, Evaluation and Baselines
Josiah Wang | Robert Gaizauskas
Proceedings of the 15th European Workshop on Natural Language Generation (ENLG)

Combining Geometric, Textual and Visual Features for Predicting Prepositions in Image Descriptions
Arnau Ramisa | Josiah Wang | Ying Lu | Emmanuel Dellandrea | Francesc Moreno-Noguer | Robert Gaizauskas
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

2014

A Hybrid Approach to Multi-document Summarization of Opinions in Reviews
Giuseppe Di Fabbrizio | Amanda Stent | Robert Gaizauskas
Proceedings of the 8th International Natural Language Generation Conference (INLG)

Assigning Terms to Domains by Document Classification
Robert Gaizauskas | Emma Barker | Monica Lestari Paramita | Ahmet Aker
Proceedings of the 4th International Workshop on Computational Terminology (Computerm)

A Poodle or a Dog? Evaluating Automatic Image Annotation Using Human Descriptions at Different Levels of Granularity
Josiah Wang | Fei Yan | Ahmet Aker | Robert Gaizauskas
Proceedings of the Third Workshop on Vision and Language

Graph Ranking for Collective Named Entity Disambiguation
Ayman Alhelbawy | Robert Gaizauskas
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Bootstrapping Term Extractors for Multiple Languages
Ahmet Aker | Monica Paramita | Emma Barker | Robert Gaizauskas
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Terminology extraction resources are needed for a wide range of human language technology applications, including knowledge management, information extraction, semantic search, cross-language information retrieval and automatic and assisted translation. We present a low-cost method for creating terminology extraction resources for 21 non-English EU languages. Using parallel corpora and a projection method, we create a General POS Tagger for these languages. We also investigate the use of EuroVoc terms and a Wikipedia corpus to automatically create a term grammar for each language. Our results show that these automatically generated resources can assist the term extraction process with similar performance to manually generated resources. All resources resulting from this experiment are freely available for download.

Bilingual dictionaries for all EU languages
Ahmet Aker | Monica Paramita | Mārcis Pinnis | Robert Gaizauskas
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Bilingual dictionaries can be automatically generated using the GIZA++ tool. However, these dictionaries contain a lot of noise, which negatively affects the quality of the output of tools that rely on them. In this work we present three different methods for cleaning noise from automatically generated bilingual dictionaries: LLR, pivot and transliteration based approaches. We have applied these approaches to the GIZA++ dictionaries – dictionaries covering official EU languages – in order to remove noise. Our evaluation showed that all methods help to reduce noise. However, the best performance is achieved using the transliteration based approach. We provide all bilingual dictionaries (the original GIZA++ dictionaries and the cleaned ones) free for download. We also provide the cleaning tools and scripts for free download.

Collective Named Entity Disambiguation using Graph Ranking and Clique Partitioning Approaches
Ayman Alhelbawy | Robert Gaizauskas
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

Empirical Validation of Reichenbach’s Tense Framework
Leon Derczynski | Robert Gaizauskas
Proceedings of the 10th International Conference on Computational Semantics (IWCS 2013) – Long Papers

Extracting bilingual terminologies from comparable corpora
Ahmet Aker | Monica Paramita | Rob Gaizauskas
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Temporal Signals Help Label Temporal Relations
Leon Derczynski | Robert Gaizauskas
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2012

Automatic Bilingual Phrase Extraction from Comparable Corpora
Ahmet Aker | Yang Feng | Robert Gaizauskas
Proceedings of COLING 2012: Posters

TIMEN: An Open Temporal Expression Normalisation Resource
Hector Llorens | Leon Derczynski | Robert Gaizauskas | Estela Saquete
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Temporal expressions are words or phrases that describe a point, duration or recurrence in time. Automatically annotating these expressions is a research goal of increasing interest. Recognising them can be achieved with minimally supervised machine learning, but interpreting them accurately (normalisation) is a complex task requiring human knowledge. In this paper, we present TIMEN, a community-driven tool for temporal expression normalisation. TIMEN is derived from current best approaches and is an independent tool, enabling easy integration in existing systems. We argue that temporal expression normalisation can only be effectively performed with a large knowledge base and set of rules. Our solution is a framework and system with which to capture this knowledge for different languages. Using both existing and newly-annotated data, we present results showing competitive performance and invite the IE community to contribute to a knowledge base in order to solve the temporal expression normalisation problem.

Correlation between Similarity Measures for Inter-Language Linked Wikipedia Articles
Monica Lestari Paramita | Paul Clough | Ahmet Aker | Robert Gaizauskas
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Wikipedia articles in different languages have been mined to support various tasks, such as Cross-Language Information Retrieval (CLIR) and Statistical Machine Translation (SMT). Articles on the same topic in different languages are often connected by inter-language links, which can be used to identify similar or comparable content. In this work, we investigate the correlation between similarity measures utilising language-independent and language-dependent features and respective human judgments. A set of 800 Wikipedia article pairs from 8 different language pairs was collected and judged for similarity by two assessors. We report the development of this corpus and inter-assessor agreement between judges across the languages. Results show that similarity measured using language-independent features is comparable to using an approach based on translating non-English documents. In both cases the correlation with human judgments is low but also dependent upon the language pair. The results and corpus generated from this work also provide insights into the measurement of cross-language similarity.

A light way to collect comparable corpora from the Web
Ahmet Aker | Evangelos Kanoulas | Robert Gaizauskas
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Statistical Machine Translation (SMT) relies on the availability of rich parallel corpora. However, in the case of under-resourced languages, parallel corpora are not readily available. To overcome this problem previous work has recognized the potential of using comparable corpora as training data. The process of obtaining such data usually involves (1) downloading a separate list of documents for each language, (2) matching the documents between two languages usually by comparing the document contents, and finally (3) extracting useful data for SMT from the matched document pairs. This process requires a large amount of time and resources since a huge volume of documents needs to be downloaded to increase the chances of finding good document pairs. In this work we aim to reduce the amount of time and resources spent for tasks 1 and 2. Instead of obtaining full documents we first obtain just titles along with some meta-data such as time and date of publication. Titles can be obtained through Web Search and RSS News feed collections so that download of the full documents is not needed. We show experimentally that titles can be used to approximate the comparison between documents using full document contents.

Lack of sufficient parallel data for many languages and domains is currently one of the major obstacles to further advancement of automated translation. The ACCURAT project is addressing this issue by researching how to improve machine translation systems by using comparable corpora. In this paper we present tools and techniques developed in the ACCURAT project that allow additional data needed for statistical machine translation to be extracted from comparable corpora. We present methods and tools for acquisition of comparable corpora from the Web and other sources, for evaluation of the comparability of collected corpora, for multi-level alignment of comparable corpora and for extraction of lexical and terminological data for machine translation. Finally, we present initial evaluation results on the utility of collected corpora in domain-adapted machine translation and real-life applications.

Assessing the Comparability of News Texts
Emma Barker | Robert Gaizauskas
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Comparable news texts are frequently proposed as a potential source of alignable subsentential fragments for use in statistical machine translation systems. But can we assess just how potentially useful they will be? In this paper we first discuss a scheme for classifying news text pairs according to the degree of relatedness of the events they report and investigate how robust this classification scheme is via a multi-lingual annotation exercise. We then propose an annotation methodology, similar to that used in summarization evaluation, to allow us to identify and quantify shared content at the subsentential level in news text pairs and report a preliminary exercise to assess this method. We conclude by discussing how this work fits into a broader programme of assessing the potential utility of comparable news texts for extracting paraphrases/translational equivalents for use in language processing applications.

2010

USFD2: Annotating Temporal Expresions and TLINKs for TempEval-2
Leon Derczynski | Robert Gaizauskas
Proceedings of the 5th International Workshop on Semantic Evaluation

Model Summaries for Location-related Images
Ahmet Aker | Robert Gaizauskas
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

At present there is no publicly available data set to evaluate the performance of different summarization systems on the task of generating location-related extended image captions. In this paper we describe a corpus of human generated model captions in English and German. We have collected 932 model summaries in English from existing image descriptions and machine translated these summaries into German. We also performed post-editing on the translated German summaries to ensure high quality. Both English and German summaries are evaluated using a readability assessment as in DUC and TAC to assess their quality. Our model summaries performed similarly to the ones reported in Dang (2005) and thus are suitable for evaluating automatic summarization systems on the task of generating image descriptions for location-related images. In addition, we also investigated whether post-editing of machine-translated model summaries is necessary for automated ROUGE evaluations. We found a high correlation in ROUGE scores between post-edited and non-post-edited model summaries, which indicates that the expensive process of post-editing is not necessary.

Analysing Temporally Annotated Corpora with CAVaT
Leon Derczynski | Robert Gaizauskas
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

We present CAVaT, a tool that performs Corpus Analysis and Validation for TimeML. CAVaT is an open source, modular checking utility for statistical analysis of features specific to temporally-annotated natural language corpora. It provides reporting, highlights salient links between a variety of general and time-specific linguistic features, and also validates a temporal annotation to ensure that it is logically consistent and sufficiently annotated. Uniquely, CAVaT provides analysis specific to TimeML-annotated temporal information. TimeML is a standard for annotating temporal information in natural language text. In this paper, we present the reporting part of CAVaT, and then its error-checking ability, including the workings of several novel TimeML document verification methods. This is followed by the execution of some example tasks using the tool to show relations between times, events, signals and links. We also demonstrate inconsistencies in a TimeML corpus (TimeBank) that have been detected with CAVaT.

Developing Morphological Analysers for South Asian Languages: Experimenting with the Hindi and Gujarati Languages
Niraj Aswani | Robert Gaizauskas
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

A considerable amount of work has been put into development of stemmers and morphological analysers. The majority of these approaches use hand-crafted suffix-replacement rules but a few try to discover such rules from corpora. While most of the approaches remove or replace suffixes, there are examples of derivational stemmers which are based on prefixes as well. In this paper we present a rule-based morphological analyser. We propose an approach that takes both prefixes and suffixes into account. Given a corpus and a dictionary, our method can be used to obtain a set of suffix-replacement rules for deriving an inflected word’s root form. We developed an approach for the Hindi language but show that the approach is portable, at least to related languages, by adapting it to the Gujarati language. Given that the entire process of developing such a ruleset is simple and fast, our approach can be used for rapid development of morphological analysers and yet it can obtain competitive results with analysers built relying on human authored rules.

English-Hindi Transliteration using Multiple Similarity Metrics
Niraj Aswani | Robert Gaizauskas
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In this paper, we present an approach to measure the transliteration similarity of English-Hindi word pairs. Our approach has two components. First we propose a bi-directional mapping between one or more characters in the Devanagari script and one or more characters in the Roman script (pronounced as in English). This allows a given Hindi word written in Devanagari to be transliterated into the Roman script and vice-versa. Second, we present an algorithm for computing a similarity measure that is a variant of Dice’s coefficient measure and the LCSR measure and which also takes into account the constraints needed to match English-Hindi transliterated words. Finally, by evaluating various similarity metrics individually and together under a multiple measure agreement scenario, we show that it is possible to achieve a 0.92 f-measure in identifying English-Hindi word pairs that are transliterations. In order to assess the portability of our approach to other similar languages we adapt our system to the Gujarati language.

Using Dialogue Corpora to Extend Information Extraction Patterns for Natural Language Understanding of Dialogue
Roberta Catizone | Alexiei Dingli | Robert Gaizauskas
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

This paper examines how Natural Language Processing (NLP) resources and online dialogue corpora can be used to extend coverage of Information Extraction (IE) templates in a Spoken Dialogue system. IE templates are used as part of a Natural Language Understanding module for identifying meaning in a user utterance. The use of NLP tools in Dialogue systems is a difficult task given 1) spoken dialogue is often not well-formed and 2) there is a serious lack of dialogue data. In spite of that, we have devised a method for extending IE patterns using standard NLP tools and available dialogue corpora found on the web. In this paper, we explain our method which includes using a set of NLP modules developed using GATE (a General Architecture for Text Engineering), as well as a general purpose editing tool that we built to facilitate the IE rule creation process. Lastly, we present directions for future work in this area.

Generating Image Descriptions Using Dependency Relational Patterns
Ahmet Aker | Robert Gaizauskas
Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics

Multi-Document Summarization Using A* Search and Discriminative Learning
Ahmet Aker | Trevor Cohn | Robert Gaizauskas
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

2009

Disambiguation of Biomedical Abbreviations
Mark Stevenson | Yikun Guo | Abdulaziz Alamri | Robert Gaizauskas
Proceedings of the BioNLP 2009 Workshop

Summary Generation for Toponym-referenced Images using Object Type Language Models
Ahmet Aker | Robert Gaizauskas
Proceedings of the International Conference RANLP-2009

2008

ANNALIST - ANNotation ALIgnment and Scoring Tool
George Demetriou | Robert Gaizauskas | Haotian Sun | Angus Roberts
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

In this paper we describe ANNALIST (Annotation, Alignment and Scoring Tool), a scoring system for the evaluation of the output of semantic annotation systems. ANNALIST has been designed as a system that is easily extensible and configurable for different domains, data formats, and evaluation tasks. The system architecture enables data input via the use of plugins and the users can access the system’s internal alignment and scoring mechanisms without the need to convert their data to a specified format. Although developed for evaluation tasks that involve the scoring of entity mentions and relations primarily, ANNALIST’s generic object representation and the availability of a range of criteria for the comparison of annotations enable the system to be tailored to a variety of scoring jobs. The paper reports on results from using ANNALIST in real-world situations in comparison to other scorers which are more established in the literature. ANNALIST has been used extensively for evaluation tasks within the VIKEF (EU FP6) and CLEF (UK MRC) projects.

Acquiring Sense Tagged Examples using Relevance Feedback
Mark Stevenson | Yikun Guo | Robert Gaizauskas
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

Extracting Clinical Relationships from Patient Narratives
Angus Roberts | Robert Gaizauskas | Mark Hepple
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing

Knowledge Sources for Word Sense Disambiguation of Biomedical Text
Mark Stevenson | Yinkun Guo | Robert Gaizauskas | David Martinez
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing

Generating Image Captions using Topic Focused Multi-document Summarization
Robert Gaizauskas
Coling 2008: Proceedings of the workshop Multi-source Multilingual Information Extraction and Summarization

Evaluating automatically generated user-focused multi-document summaries for geo-referenced images
Ahmet Aker | Robert Gaizauskas
Coling 2008: Proceedings of the workshop Multi-source Multilingual Information Extraction and Summarization

A Data Driven Approach to Query Expansion in Question Answering
Leon Derczynski | Jun Wang | Robert Gaizauskas | Mark A. Greenwood
Coling 2008: Proceedings of the 2nd workshop on Information Retrieval for Question Answering

Evaluation of Automatically Reformulated Questions in Question Series
Richard Shaw | Ben Solway | Robert Gaizauskas | Mark A. Greenwood
Coling 2008: Proceedings of the 2nd workshop on Information Retrieval for Question Answering

2007

USFD: Preliminary Exploration of Features and Classifiers for the TempEval-2007 Task
Mark Hepple | Andrea Setzer | Robert Gaizauskas
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

2006

Language Resources for Background Gathering
Horacio Saggion | Robert Gaizauskas
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

We describe the Cubreporter information access system which allows access to news archives through the use of natural language technology. The system includes advanced text search, question answering, summarization, and entity profiling capabilities. It has been designed taking into account the characteristics of the background gathering task.

Simulating Cub Reporter Dialogues: The collection of naturalistic human-human dialogues for information access to text archives
Emma Barker | Ryuichiro Higashinaka | François Mairesse | Robert Gaizauskas | Marilyn Walker | Jonathan Foster
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper describes a dialogue data collection experiment and resulting corpus for dialogues between a senior mobile journalist and a junior cub reporter back at the office. The purpose of the dialogue is for the mobile journalist to collect background information in preparation for an interview or on-the-site coverage of a breaking story. The cub reporter has access to text archives that contain such background information. A unique aspect of these dialogues is that they capture information-seeking behavior for an open-ended task against a large unstructured data source. Initial analyses of the corpus show that the experimental design leads to real-time, mixed-initiative, highly interactive dialogues with many interesting properties.