Michael Stewart


2022

pdf
QuickGraph: A Rapid Annotation Tool for Knowledge Graph Extraction from Technical Text
Tyler Bikaun | Michael Stewart | Wei Liu
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations

Acquiring high-quality annotated corpora for complex multi-task information extraction (MT-IE) is an arduous and costly process for human-annotators. Adoption of unsupervised techniques for automated annotation have thus become popular. However, these techniques rely heavily on dictionaries, gazetteers, and knowledge bases. While such resources are abundant for general domains, they are scarce for specialised technical domains. To tackle this challenge, we present QuickGraph, the first collaborative MT-IE annotation tool built with indirect weak supervision and clustering to maximise annotator productivity.QuickGraph’s main contribution is a set of novel features that enable knowledge graph extraction through rapid and consistent complex multi-task entity and relation annotation. In this paper, we discuss these key features and qualitatively compare QuickGraph to existing annotation tools.

2021

pdf
LexiClean: An annotation tool for rapid multi-task lexical normalisation
Tyler Bikaun | Tim French | Melinda Hodkiewicz | Michael Stewart | Wei Liu
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

NLP systems are often challenged by difficulties arising from noisy, non-standard, and domain specific corpora. The task of lexical normalisation aims to standardise such corpora, but currently lacks suitable tools to acquire high-quality annotated data to support deep learning based approaches. In this paper, we present LexiClean, the first open-source web-based annotation tool for multi-task lexical normalisation. LexiClean’s main contribution is support for simultaneous in situ token-level modification and annotation that can be rapidly applied corpus wide. We demonstrate the usefulness of our tool through a case study on two sets of noisy corpora derived from the specialised-domain of industrial mining. We show that LexiClean allows for the rapid and efficient development of high-quality parallel corpora. A demo of our system is available at: https://youtu.be/P7_ooKrQPDU.

2019

pdf
Redcoat: A Collaborative Annotation Tool for Hierarchical Entity Typing
Michael Stewart | Wei Liu | Rachel Cardell-Oliver
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): System Demonstrations

We introduce Redcoat, a web-based annotation tool that supports collaborative hierarchical entity typing. As an annotation tool, Redcoat also facilitates knowledge elicitation by allowing the creation and continuous refinement of concept hierarchies during annotation. It aims to minimise not only annotation time but the time it takes for project creators to set up and distribute projects to annotators. Projects created using the web-based interface can be rapidly distributed to a list of email addresses. Redcoat handles the propagation of documents amongst annotators and automatically scales the annotation workload depending on the number of active annotators. In this paper we discuss these key features and outline Redcoat’s system architecture. We also highlight Redcoat’s unique benefits over existing annotation tools via a qualitative comparison.