Rob Procter


2024

CWTM: Leveraging Contextualized Word Embeddings from BERT for Neural Topic Modeling
Zheng Fang | Yulan He | Rob Procter
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Most existing topic models rely on bag-of-words (BOW) representations, which limits their ability to capture word order information and leads to challenges with out-of-vocabulary (OOV) words in new documents. Contextualized word embeddings, however, show superiority in word sense disambiguation and effectively address the OOV issue. In this work, we introduce a novel neural topic model called the Contextualized Word Topic Model (CWTM), which integrates contextualized word embeddings from BERT. The model is capable of learning the topic vector of a document without BOW information. In addition, it can derive the topic vectors for individual words within a document based on their contextualized word embeddings. Experiments across various datasets show that CWTM generates more coherent and meaningful topics compared to existing topic models, while also accommodating unseen words in newly encountered documents.

2023

A User-Centered, Interactive, Human-in-the-Loop Topic Modelling System
Zheng Fang | Lama Alqazlan | Du Liu | Yulan He | Rob Procter
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics

Human-in-the-loop topic modelling incorporates users’ knowledge into the modelling process, enabling them to refine the model iteratively. Recent research has demonstrated the value of user feedback, but there are still issues to consider, such as the difficulty in tracking changes, comparing different models and the lack of evaluation based on real-world examples of use. We developed a novel, interactive human-in-the-loop topic modelling system with a user-friendly interface that enables users to compare and record every step they take, and a novel topic word suggestion feature to help users provide feedback that is faithful to the ground truth. Our system supports not only what traditional topic models can do, i.e., learning the topics from the whole corpus, but also targeted topic modelling, i.e., learning topics for specific aspects of the corpus. In this article, we provide an overview of the system and present the results of a series of user studies designed to assess the value of the system in progressively more realistic applications of topic modelling.

PANACEA: An Automated Misinformation Detection System on COVID-19
Runcong Zhao | Miguel Arana-Catania | Lixing Zhu | Elena Kochkina | Lin Gui | Arkaitz Zubiaga | Rob Procter | Maria Liakata | Yulan He
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations

In this demo, we introduce PANACEA, a web-based misinformation detection system for COVID-19 related claims, which has two modules: fact-checking and rumour detection. Our fact-checking module, which is supported by novel natural language inference methods with a self-attention network, outperforms state-of-the-art approaches. It is also able to give an automated veracity assessment and ranked supporting evidence with the stance towards the claim to be checked. In addition, PANACEA adapts the bi-directional graph convolutional networks model, which is able to detect rumours based on comment networks of related tweets instead of relying on a knowledge base. This rumour detection module assists by warning users in the early stages, when a knowledge base may not be available.

2022

Template-based Abstractive Microblog Opinion Summarization
Iman Munire Bilal | Bo Wang | Adam Tsakalidis | Dong Nguyen | Rob Procter | Maria Liakata
Transactions of the Association for Computational Linguistics, Volume 10

We introduce the task of microblog opinion summarization (MOS) and share a dataset of 3100 gold-standard opinion summaries to facilitate research in this domain. The dataset contains summaries of tweets spanning a 2-year period and covers more topics than any other public Twitter summarization dataset. Summaries are abstractive in nature and have been created by journalists skilled in summarizing news articles following a template separating factual information (main story) from author opinions. Our method differs from previous work on generating gold-standard summaries from social media, which usually involves selecting representative posts and thus favors extractive summarization models. To showcase the dataset’s utility and challenges, we benchmark a range of abstractive and extractive state-of-the-art summarization models and achieve good performance, with the former outperforming the latter. We also show that fine-tuning is necessary to improve performance and investigate the benefits of using different sample sizes.

Unsupervised Opinion Summarisation in the Wasserstein Space
Jiayu Song | Iman Munire Bilal | Adam Tsakalidis | Rob Procter | Maria Liakata
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Opinion summarisation synthesises opinions expressed in a group of documents discussing the same topic to produce a single summary. Recent work has looked at opinion summarisation of clusters of social media posts. Such posts are noisy and have unpredictable structure, posing additional challenges for the construction of the summary distribution and the preservation of meaning compared to online reviews, which have so far been the focus of opinion summarisation. To address these challenges, we present WassOS, an unsupervised abstractive summarisation model which makes use of the Wasserstein distance. A Variational Autoencoder is first used to obtain the distribution of documents/posts, and the summary distribution is obtained as the Wasserstein barycenter. We create separate disentangled latent semantic and syntactic representations of the summary, which are fed into a GRU decoder with a transformer layer to produce the final summary. Our experiments on multiple datasets including reviews, Twitter clusters and Reddit threads show that WassOS almost always outperforms the state-of-the-art on ROUGE metrics and consistently produces the best summaries with respect to meaning preservation according to human evaluations.

A Pipeline for Generating, Annotating and Employing Synthetic Data for Real World Question Answering
Matt Maufe | James Ravenscroft | Rob Procter | Maria Liakata
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

Question Answering (QA) is a growing area of research, often used to facilitate the extraction of information from within documents. State-of-the-art QA models are usually pre-trained on domain-general corpora like Wikipedia and thus tend to struggle on out-of-domain documents without fine-tuning. We demonstrate that synthetic domain-specific datasets can be generated easily using domain-general models, while still providing significant improvements to QA performance. We present two new tools for this task: a flexible pipeline for validating the synthetic QA data and training downstream models on it, and an online interface to facilitate human annotation of this generated data. Using this interface, crowdworkers labelled 1117 synthetic QA pairs, which we then used to fine-tune downstream models and improve domain-specific QA performance by 8.75 F1.

2021

A Query-Driven Topic Model
Zheng Fang | Yulan He | Rob Procter
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Using Computational Grounded Theory to Understand Tutors’ Experiences in the Gig Economy
Lama Alqazlan | Rob Procter | Michael Castelle
Proceedings of the Workshop on Natural Language Processing for Digital Humanities

The introduction of online marketplace platforms has led to the advent of new forms of flexible, on-demand (or ‘gig’) work. Yet, most prior research concerning the experience of gig workers examines delivery or crowdsourcing platforms, while the experience of the large numbers of workers who undertake educational labour in the form of tutoring gigs remains understudied. To address this, we use a computational grounded theory approach to analyse tutors’ discussions on Reddit. This approach consists of three phases: data exploration, modelling and human-centred interpretation. We use both validation and human evaluation to increase the trustworthiness and reliability of the computational methods. This paper is a work in progress and reports on the first of the three phases of this approach.

Evaluation of Abstractive Summarisation Models with Machine Translation in Deliberative Processes
Miguel Arana-Catania | Rob Procter | Yulan He | Maria Liakata
Proceedings of the Third Workshop on New Frontiers in Summarization

We present work on summarising deliberative processes for non-English languages. Unlike commonly studied datasets, such as news articles, this deliberation dataset reflects difficulties of combining multiple narratives, mostly of poor grammatical quality, in a single text. We report an extensive evaluation of a wide range of abstractive summarisation models in combination with an off-the-shelf machine translation model. Texts are translated into English, summarised, and translated back to the original language. We obtain promising results regarding the fluency, consistency and relevance of the summaries produced. Our approach is easy to implement for many languages for production purposes by simply changing the translation model.

Evaluation of Thematic Coherence in Microblogs
Iman Munire Bilal | Bo Wang | Maria Liakata | Rob Procter | Adam Tsakalidis
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Collecting together microblogs representing opinions about the same topics within the same timeframe is useful to a number of different tasks and practitioners. A major question is how to evaluate the quality of such thematic clusters. Here we create a corpus of microblog clusters from three different domains and time windows and define the task of evaluating thematic coherence. We provide annotation guidelines and human annotations of thematic coherence by journalist experts. We subsequently investigate the efficacy of different automated evaluation metrics for the task. We consider a range of metrics including surface level metrics, ones for topic model coherence and text generation metrics (TGMs). While surface level metrics perform well, outperforming topic coherence metrics, they are not as consistent as TGMs. TGMs are more reliable than all other metrics considered for capturing thematic coherence in microblog clusters due to being less sensitive to the effect of time windows.

2017

TDParse: Multi-target-specific sentiment recognition on Twitter
Bo Wang | Maria Liakata | Arkaitz Zubiaga | Rob Procter
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers

Existing target-specific sentiment recognition methods consider only a single target per tweet, and have been shown to miss nearly half of the actual targets mentioned. We present a corpus of UK election tweets, with an average of 3.09 entities per tweet and more than one type of sentiment in half of the tweets. This requires a method for multi-target-specific sentiment recognition, which we develop by using the context around a target as well as syntactic dependencies involving the target. We present results of our method on both a benchmark corpus of single targets and the multi-target election corpus, showing state-of-the-art performance on both corpora and outperforming previous approaches to the multi-target sentiment task as well as deep learning models for single-target sentiment.

SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for rumours
Leon Derczynski | Kalina Bontcheva | Maria Liakata | Rob Procter | Geraldine Wong Sak Hoi | Arkaitz Zubiaga
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

Media is full of false claims. Even Oxford Dictionaries named “post-truth” as the word of 2016. This makes it more important than ever to build systems that can identify the veracity of a story, and the nature of the discourse around it. RumourEval is a SemEval shared task that aims to identify and handle rumours and reactions to them, in text. We present an annotation scheme, a large dataset covering multiple topics – each having their own families of claims and replies – and use these to pose two concrete challenges as well as the results achieved by participants on these challenges.

TOTEMSS: Topic-based, Temporal Sentiment Summarisation for Twitter
Bo Wang | Maria Liakata | Adam Tsakalidis | Spiros Georgakopoulos Kolaitis | Symeon Papadopoulos | Lazaros Apostolidis | Arkaitz Zubiaga | Rob Procter | Yiannis Kompatsiaris
Proceedings of the IJCNLP 2017, System Demonstrations

We present a system for time-sensitive, topic-based summarisation of the sentiment around target entities and topics in collections of tweets. We describe the main elements of the system and illustrate its functionality with two examples of sentiment analysis of topics related to the 2017 UK general election.

2016

Stance Classification in Rumours as a Sequential Task Exploiting the Tree Structure of Social Media Conversations
Arkaitz Zubiaga | Elena Kochkina | Maria Liakata | Rob Procter | Michal Lukasik
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

Rumour stance classification, the task that determines if each tweet in a collection discussing a rumour is supporting, denying, questioning or simply commenting on the rumour, has been attracting substantial interest. Here we introduce a novel approach that makes use of the sequence of transitions observed in tree-structured conversation threads in Twitter. The conversation threads are formed by harvesting users’ replies to one another, which results in a nested tree-like structure. Previous work addressing the stance classification task has treated each tweet as a separate unit. Here we analyse tweets by virtue of their position in a sequence and test two sequential classifiers, Linear-Chain CRF and Tree CRF, each of which makes different assumptions about the conversational structure. We experiment with eight Twitter datasets, collected during breaking news, and show that exploiting the sequential structure of Twitter conversations achieves significant improvements over the non-sequential methods. Our work is the first to model Twitter conversations as a tree structure in this manner, introducing a novel way of tackling NLP tasks on Twitter conversations.

2015

WarwickDCS: From Phrase-Based to Target-Specific Sentiment Recognition
Richard Townsend | Adam Tsakalidis | Yiwei Zhou | Bo Wang | Maria Liakata | Arkaitz Zubiaga | Alexandra Cristea | Rob Procter
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2014

University_of_Warwick: SENTIADAPTRON - A Domain Adaptable Sentiment Analyser for Tweets - Meets SemEval
Richard Townsend | Aaron Kalair | Ojas Kulkarni | Rob Procter | Maria Liakata
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

2012

A data and analysis resource for an experiment in text mining a collection of micro-blogs on a political topic.
William Black | Rob Procter | Steven Gray | Sophia Ananiadou
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

The analysis of a corpus of micro-blogs on the topic of the 2011 UK referendum about the Alternative Vote has been undertaken as a joint activity by text miners and social scientists. To facilitate the collaboration, the corpus and its analysis are managed in a Web-accessible framework that allows users to upload their own textual data for analysis and to manage their own text annotation resources used for analysis. The framework also allows annotations to be searched, and the analysis to be re-run after amending the analysis resources. The corpus is also doubly human-annotated, stating both whether each tweet is overall positive or negative in sentiment and whether it is for or against the proposition of the referendum.

2009

ASSIST : un moteur de recherche spécialisé pour l’analyse des cadres d’expériences
Davy Weissenbacher | Elisa Pieri | Sophia Ananiadou | Brian Rea | Farida Vis | Yuwei Lin | Rob Procter | Peter Halfpenny
Actes de la 16ème conférence sur le Traitement Automatique des Langues Naturelles. Démonstrations

Qualitative data analysis requires sociologists to do substantial work selecting and interpreting documents. To ease this work, the community has adopted software tools, but their functionality is still limited. The ASSIST project is an exploratory study to identify the natural language processing (NLP) modules that can assist sociologists in their analytical work. We present the search engine we built and justify the choice of NLP components integrated into the prototype.