2025
RAEmoLLM: Retrieval Augmented LLMs for Cross-Domain Misinformation Detection Using In-Context Learning Based on Emotional Information
Zhiwei Liu
|
Kailai Yang
|
Qianqian Xie
|
Christine de Kock
|
Sophia Ananiadou
|
Eduard Hovy
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Misinformation is prevalent in various fields such as education, politics, health, etc., causing significant harm to society. However, current methods for cross-domain misinformation detection rely on effort- and resource-intensive fine-tuning and complex model structures. With the outstanding performance of LLMs, many studies have employed them for misinformation detection. Unfortunately, they focus on in-domain tasks and do not incorporate significant sentiment and emotion features (which we jointly call affect). In this paper, we propose RAEmoLLM, the first retrieval-augmented (RAG) LLM framework to address cross-domain misinformation detection using in-context learning based on affective information. RAEmoLLM includes three modules. (1) In the index construction module, we apply an emotional LLM to obtain affective embeddings from all domains to construct a retrieval database. (2) The retrieval module uses the database to recommend top-K examples (text-label pairs) from source-domain data for target-domain content. (3) These examples are adopted as few-shot demonstrations for the inference module to process the target-domain content. RAEmoLLM can effectively enhance the general performance of LLMs in cross-domain misinformation detection tasks through affect-based retrieval, without fine-tuning. We evaluate our framework on three misinformation benchmarks. Results show that RAEmoLLM achieves significant improvements compared to other few-shot methods on the three datasets, with the highest increases of 15.64%, 31.18%, and 15.73%, respectively. This project is available at https://github.com/lzw108/RAEmoLLM.
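A minimal sketch of the retrieve-then-prompt flow described in the abstract, with a generic sentence-embedding model standing in for the paper's emotional LLM; the model name, helper functions, and prompt format below are illustrative assumptions, not the released implementation.

```python
# Illustrative sketch only: a generic embedding model stands in for the
# affective (emotional-LLM) embeddings used by RAEmoLLM.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in encoder (assumption)

def build_index(source_texts):
    """Index construction: one embedding per source-domain example."""
    return encoder.encode(source_texts, normalize_embeddings=True)

def retrieve_top_k(index, source_pairs, target_text, k=4):
    """Retrieval: top-k source (text, label) pairs most similar to the target text."""
    query = encoder.encode([target_text], normalize_embeddings=True)[0]
    scores = index @ query  # cosine similarity, since embeddings are normalized
    return [source_pairs[i] for i in np.argsort(-scores)[:k]]

def build_prompt(demonstrations, target_text):
    """Inference: retrieved pairs become few-shot demonstrations for a frozen LLM."""
    shots = "\n\n".join(f"Text: {t}\nLabel: {l}" for t, l in demonstrations)
    return f"{shots}\n\nText: {target_text}\nLabel:"
```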
Proceedings of the 24th Workshop on Biomedical Language Processing
Dina Demner-Fushman
|
Sophia Ananiadou
|
Makoto Miwa
|
Junichi Tsujii
Proceedings of the 24th Workshop on Biomedical Language Processing
Enhancing Stress Detection on Social Media Through Multi-Modal Fusion of Text and Synthesized Visuals
Efstathia Soufleri
|
Sophia Ananiadou
Proceedings of the 24th Workshop on Biomedical Language Processing
Social media platforms generate an enormous volume of multi-modal data, yet stress detection research has predominantly relied on text-based analysis. In this work, we propose a novel framework that integrates textual content with synthesized visual cues to enhance stress detection. Using the generative model DALL·E, we synthesize images from social media posts, which are then fused with text through the multi-modal capabilities of a pre-trained CLIP model. Our approach is evaluated on the Dreaddit dataset, where a classifier trained on frozen CLIP features achieves 94.90% accuracy, and full fine-tuning further improves performance to 98.41%. These results underscore that integrating synthesized visuals with textual data not only enhances stress detection but also offers a robust alternative to traditional text-only methods, paving the way for innovative approaches in mental health monitoring and social media analytics.
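A hedged sketch of the frozen-CLIP fusion setup the abstract describes; the checkpoint name, the concatenation-plus-linear fusion head, and the two-class output are illustrative assumptions rather than the paper's exact configuration.

```python
# Sketch under assumptions: frozen CLIP text/image features, trainable linear head.
import torch
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
fusion_head = torch.nn.Linear(2 * clip.config.projection_dim, 2)  # stressed / not stressed

def classify(post_text, synthesized_image):
    """synthesized_image: a PIL image generated from the post (e.g. by DALL-E)."""
    with torch.no_grad():
        text_in = proc(text=[post_text], return_tensors="pt", truncation=True)
        image_in = proc(images=synthesized_image, return_tensors="pt")
        text_feat = clip.get_text_features(**text_in)     # frozen text branch
        image_feat = clip.get_image_features(**image_in)  # frozen image branch
    fused = torch.cat([text_feat, image_feat], dim=-1)
    return fusion_head(fused).softmax(dim=-1)             # only the head is trained
```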
Overview of the BioLaySumm 2025 Shared Task on Lay Summarization of Biomedical Research Articles and Radiology Reports
Chenghao Xiao
|
Kun Zhao
|
Xiao Wang
|
Siwei Wu
|
Sixing Yan
|
Tomas Goldsack
|
Sophia Ananiadou
|
Noura Al Moubayed
|
Liang Zhan
|
William K. Cheung
|
Chenghua Lin
Proceedings of the 24th Workshop on Biomedical Language Processing
This paper presents the setup and results of the third edition of the BioLaySumm shared task on Lay Summarization of Biomedical Research Articles and Radiology Reports, hosted at the BioNLP Workshop at ACL 2025. In this task edition, we aim to build on the first two editions' successes by further increasing research interest in this important task and encouraging participants to explore novel approaches that will help advance the state-of-the-art. Specifically, we introduce the new task of Radiology Report Generation with Layman's terms, which parallels the task of lay summarization of biomedical articles in the first two editions. Overall, our results show that a broad range of innovative approaches were adopted by task participants, including inspiring explorations of the latest RL techniques used in training general-domain large reasoning models.
Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health)
Sophia Ananiadou
|
Dina Demner-Fushman
|
Deepak Gupta
|
Paul Thompson
Proceedings of the Second Workshop on Patient-Oriented Language Processing (CL4Health)
ELAINE-medLLM: Lightweight English Japanese Chinese Trilingual Large Language Model for Bio-medical Domain
Ken Yano
|
Zheheng Luo
|
Jimin Huang
|
Qianqian Xie
|
Masaki Asada
|
Chenhan Yuan
|
Kailai Yang
|
Makoto Miwa
|
Sophia Ananiadou
|
Jun’ichi Tsujii
Proceedings of the 31st International Conference on Computational Linguistics
We propose ELAINE (EngLish-jApanese-chINesE)-medLLM, a trilingual (English, Japanese, Chinese) large language model adapted to the biomedical domain, based on Llama-3-8B. The training dataset was carefully curated in terms of volume and diversity to adapt the model to the biomedical domain and endow it with trilingual capability while preserving the knowledge and abilities of the base model. Training follows a two-stage path: continued pre-training followed by supervised fine-tuning (SFT). Our results demonstrate that ELAINE-medLLM exhibits superior trilingual capabilities compared to existing bilingual or multilingual medical LLMs, without severely sacrificing the base model's capability.
EMPEC: A Comprehensive Benchmark for Evaluating Large Language Models Across Diverse Healthcare Professions
Zheheng Luo
|
Chenhan Yuan
|
Qianqian Xie
|
Sophia Ananiadou
Findings of the Association for Computational Linguistics: ACL 2025
Recent advancements in Large Language Models (LLMs) show their potential in accurately answering biomedical questions, yet current healthcare benchmarks primarily assess knowledge mastered by medical doctors, neglecting other essential professions. To address this gap, we introduce the Examinations for Medical PErsonnel in Chinese (EMPEC), a comprehensive healthcare knowledge benchmark featuring 157,803 exam questions across 124 subjects and 20 healthcare professions, including underrepresented roles like Optometrists and Audiologists. Each question is tagged for release time and source authenticity. We evaluated 17 LLMs, including proprietary and open-source models, finding that while models like GPT-4 achieved over 75% accuracy, they struggled with specialized fields and alternative medicine. Notably, we find that most medical-specific LLMs underperform their general-purpose counterparts in EMPEC, and incorporating EMPEC’s data in fine-tuning improves performance. In addition, we tested LLMs on questions released after the completion of their training to examine their ability in unseen queries. We also translated the test set into English and simplified Chinese and analyse the impact on different models. Our findings emphasize the need for broader benchmarks to assess LLM applicability in real-world healthcare, and we will provide the dataset and evaluation toolkit for future research.
FLAG-TRADER: Fusion LLM-Agent with Gradient-based Reinforcement Learning for Financial Trading
Guojun Xiong
|
Zhiyang Deng
|
Keyi Wang
|
Yupeng Cao
|
Haohang Li
|
Yangyang Yu
|
Xueqing Peng
|
Mingquan Lin
|
Kaleb E Smith
|
Xiao-Yang Liu
|
Jimin Huang
|
Sophia Ananiadou
|
Qianqian Xie
Findings of the Association for Computational Linguistics: ACL 2025
Large language models (LLMs) fine-tuned on multimodal financial data have demonstrated impressive reasoning capabilities in various financial tasks. However, they often struggle with multi-step, goal-oriented scenarios in interactive financial markets, such as trading, where complex agentic approaches are required to improve decision-making. To address this, we propose FLAG-Trader, a unified architecture integrating linguistic processing (via LLMs) with gradient-driven reinforcement learning (RL) policy optimization, in which a partially fine-tuned LLM acts as the policy network, leveraging pre-trained knowledge while adapting to the financial domain through parameter-efficient fine-tuning. Through policy gradient optimization driven by trading rewards, our framework not only enhances LLM performance in trading but also improves results on other financial-domain tasks. We present extensive empirical evidence to validate these enhancements.
SynGraph: A Dynamic Graph-LLM Synthesis Framework for Sparse Streaming User Sentiment Modeling
Xin Zhang
|
Qiyu Wei
|
Yingjie Zhu
|
Linhai Zhang
|
Deyu Zhou
|
Sophia Ananiadou
Findings of the Association for Computational Linguistics: ACL 2025
User reviews on e-commerce platforms exhibit dynamic sentiment patterns driven by temporal and contextual factors. Traditional sentiment analysis methods focus on static reviews, failing to capture the evolving temporal relationship between user sentiment rating and textual content. Sentiment analysis on streaming reviews addresses this limitation by modeling and predicting the temporal evolution of user sentiments. However, it suffers from data sparsity, manifesting in temporal, spatial, and combined forms. In this paper, we introduce SynGraph, a novel framework designed to address data sparsity in sentiment analysis on streaming reviews. SynGraph alleviates data sparsity by categorizing users into mid-tail, long-tail, and extreme scenarios and incorporating LLM-augmented enhancements within a dynamic graph-based structure. Experiments on real-world datasets demonstrate its effectiveness in addressing sparsity and improving sentiment modeling in streaming reviews.
Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)
Chung-Chi Chen
|
Antonio Moreno-Sandoval
|
Jimin Huang
|
Qianqian Xie
|
Sophia Ananiadou
|
Hsin-Hsi Chen
Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)
FinNLP-FNP-LLMFinLegal-2025 Shared Task: Financial Misinformation Detection Challenge Task
Zhiwei Liu
|
Keyi Wang
|
Zhuo Bao
|
Xin Zhang
|
Jiping Dong
|
Kailai Yang
|
Mohsinul Kabir
|
Polydoros Giannouris
|
Rui Xing
|
Seongchan Park
|
Jaehong Kim
|
Dong Li
|
Qianqian Xie
|
Sophia Ananiadou
Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)
Despite the promise of large language models (LLMs) in finance, their capabilities for financial misinformation detection (FMD) remain largely unexplored. To evaluate the capabilities of LLMs on the FMD task, we introduce the FMD Challenge, a financial misinformation detection shared task featured at COLING FinNLP-FNP-LLMFinLegal-2025. This challenge aims to evaluate the ability of LLMs to verify financial misinformation while generating plausible explanations. In this paper, we provide an overview of the task and dataset, summarize participants' methods, and present their experimental evaluations, highlighting the effectiveness of LLMs in addressing the FMD task. To the best of our knowledge, the FMD Challenge is one of the first challenges for assessing LLMs in the field of FMD. Therefore, we provide detailed observations and draw conclusions for the future development of this field.
2024
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing
Dina Demner-Fushman
|
Sophia Ananiadou
|
Makoto Miwa
|
Kirk Roberts
|
Junichi Tsujii
Proceedings of the 23rd Workshop on Biomedical Natural Language Processing
Proceedings of the First Workshop on Patient-Oriented Language Processing (CL4Health) @ LREC-COLING 2024
Dina Demner-Fushman
|
Sophia Ananiadou
|
Paul Thompson
|
Brian Ondov
Proceedings of the First Workshop on Patient-Oriented Language Processing (CL4Health) @ LREC-COLING 2024
Neuron-Level Knowledge Attribution in Large Language Models
Zeping Yu
|
Sophia Ananiadou
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
Identifying important neurons for final predictions is essential for understanding the mechanisms of large language models. Due to computational constraints, current attribution techniques struggle to operate at neuron level. In this paper, we propose a static method for pinpointing significant neurons. Compared to seven other methods, our approach demonstrates superior performance across three metrics. Additionally, since most static methods typically only identify “value neurons” directly contributing to the final prediction, we propose a method for identifying “query neurons” which activate these “value neurons”. Finally, we apply our methods to analyze six types of knowledge across both attention and feed-forward network (FFN) layers. Our method and analysis are helpful for understanding the mechanisms of knowledge storage and set the stage for future research in knowledge editing. The code is available on https://github.com/zepingyu0512/neuron-attribution.
How do Large Language Models Learn In-Context? Query and Key Matrices of In-Context Heads are Two Towers for Metric Learning
Zeping Yu
|
Sophia Ananiadou
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
We investigate the mechanism of in-context learning (ICL) on sentence classification tasks with semantically-unrelated labels ("foo"/"bar"). We find that intervening in only 1% of heads (named "in-context heads") significantly reduces ICL accuracy from 87.6% to 24.4%. To understand this phenomenon, we analyze the value-output vectors in these heads and discover that the vectors at each label position contain substantial information about the corresponding labels. Furthermore, we observe that the prediction shift from "foo" to "bar" is due to the respective reduction and increase in these heads' attention scores at the "foo" and "bar" positions. Therefore, we propose a hypothesis for ICL: in in-context heads, the value-output matrices extract label features, while the query-key matrices compute the similarity between the features at the last position and those at each label position. The query and key matrices can be considered as two towers that learn the similarity metric between the last position's features and each demonstration at the label positions. Using this hypothesis, we explain the majority label bias and recency bias in ICL and propose two methods to reduce these biases by 22% and 17%, respectively.
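A toy numerical illustration (not the paper's code) of the two-tower reading above: an in-context head's attention score compares the final position's query with the key at each demonstration's label position; all matrices and hidden states here are random stand-ins.

```python
# Toy example: attention of the last position over the "foo"/"bar" label positions.
import numpy as np

d = 8
rng = np.random.default_rng(0)
W_q, W_k = rng.normal(size=(d, d)), rng.normal(size=(d, d))

h_last = rng.normal(size=d)                            # hidden state at the final position
h_foo, h_bar = rng.normal(size=d), rng.normal(size=d)  # hidden states at the label positions

query = W_q @ h_last                                   # "query tower"
scores = np.array([(W_k @ h_foo) @ query,              # "key tower" at each label position
                   (W_k @ h_bar) @ query]) / np.sqrt(d)
attention = np.exp(scores) / np.exp(scores).sum()
# Under the hypothesis, the label position receiving more attention contributes
# more of its value-output (label) feature to the final prediction.
print(dict(zip(["foo", "bar"], attention.round(3))))
```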
Interpreting Arithmetic Mechanism in Large Language Models through Comparative Neuron Analysis
Zeping Yu
|
Sophia Ananiadou
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
We find arithmetic ability resides within a limited number of attention heads, with each head specializing in distinct operations. To delve into the reason, we introduce the Comparative Neuron Analysis (CNA) method, which identifies an internal logic chain consisting of four distinct stages from input to prediction: feature enhancing with shallow FFN neurons, feature transferring by shallow attention layers, feature predicting by arithmetic heads, and prediction enhancing among deep FFN neurons. Moreover, we identify the human-interpretable FFN neurons within both feature-enhancing and feature-predicting stages. These findings lead us to investigate the mechanism of LoRA, revealing that it enhances prediction probabilities by amplifying the coefficient scores of FFN neurons related to predictions. Finally, we apply our method in model pruning for arithmetic tasks and model editing for reducing gender bias. Code is on https://github.com/zepingyu0512/arithmetic-mechanism.
FinNLP-AgentScen-2024 Shared Task: Financial Challenges in Large Language Models - FinLLMs
Qianqian Xie
|
Jimin Huang
|
Dong Li
|
Zhengyu Chen
|
Ruoyu Xiang
|
Mengxi Xiao
|
Yangyang Yu
|
Vijayasai Somasundaram
|
Kailai Yang
|
Chenhan Yuan
|
Zheheng Luo
|
Zhiwei Liu
|
Yueru He
|
Yuechen Jiang
|
Haohang Li
|
Duanyu Feng
|
Xiao-Yang Liu
|
Benyou Wang
|
Hao Wang
|
Yanzhao Lai
|
Jordan Suchow
|
Alejandro Lopez-Lira
|
Min Peng
|
Sophia Ananiadou
Proceedings of the Eighth Financial Technology and Natural Language Processing and the 1st Agent AI for Scenario Planning
LongDocFACTScore: Evaluating the Factuality of Long Document Abstractive Summarisation
Jennifer A. Bishop
|
Sophia Ananiadou
|
Qianqian Xie
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Maintaining factual consistency is a critical issue in abstractive text summarisation, however, it cannot be assessed by traditional automatic metrics used for evaluating text summarisation, such as ROUGE scoring. Recent efforts have been devoted to developing improved metrics for measuring factual consistency using pre-trained language models, but these metrics have restrictive token limits, and are therefore not suitable for evaluating long document text summarisation. Moreover, there is limited research and resources available for evaluating whether existing automatic evaluation metrics are fit for purpose when applied in long document settings. In this work, we evaluate the efficacy of automatic metrics for assessing the factual consistency of long document text summarisation. We create a human-annotated data set for evaluating automatic factuality metrics, LongSciVerify, which contains fine-grained factual consistency annotations for long document summaries from the scientific domain. We also propose a new evaluation framework, LongDocFACTScore, which is suitable for evaluating long document summarisation. This framework allows metrics to be efficiently extended to any length document and outperforms existing state-of-the-art metrics in its ability to correlate with human measures of factuality when used to evaluate long document summarisation data sets. We make our code and LongSciVerify data set publicly available: https://github.com/jbshp/LongDocFACTScore.
2023
Proceedings of the 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
Dina Demner-Fushman
|
Sophia Ananiadou
|
Kevin Cohen
Proceedings of the 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
Gaussian Distributed Prototypical Network for Few-shot Genomic Variant Detection
Jiarun Cao
|
Niels Peek
|
Andrew Renehan
|
Sophia Ananiadou
Proceedings of the 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
Automatically identifying genetic mutations in the cancer literature using text mining technology has been an important way to study the vast amount of cancer medical literature. However, novel knowledge regarding genetic variants proliferates rapidly, and current supervised learning models struggle to discover these unknown entity types. Few-shot learning allows a model to generalize effectively to new entity types, but it has not yet been explored for cancer mutation detection. This paper addresses cancer mutation detection with few-shot learning paradigms. We propose the GDPN framework, which models the label dependency from the training examples in the support set and approximates the transition scores via a Gaussian distribution. Experiments on three benchmark cancer mutation datasets show the effectiveness of our proposed model.
Zero-shot Temporal Relation Extraction with ChatGPT
Chenhan Yuan
|
Qianqian Xie
|
Sophia Ananiadou
Proceedings of the 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
The goal of temporal relation extraction is to infer the temporal relation between two events in a document. Supervised models are dominant in this task. In this work, we investigate ChatGPT's ability to perform zero-shot temporal relation extraction. We designed three different prompt techniques to break down the task and evaluate ChatGPT. Our experiments show that ChatGPT's performance has a large gap with that of supervised methods and relies heavily on the design of prompts. We further demonstrate that ChatGPT can correctly infer more of the small relation classes than supervised methods. We also discuss the current shortcomings of ChatGPT on temporal relation extraction: it cannot maintain consistency during temporal inference and it fails at long-dependency temporal inference.
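An illustrative zero-shot prompt for this task, assuming the current OpenAI Python client; the wording, label set, and model name are stand-ins and do not reproduce the three prompt designs evaluated in the paper.

```python
# Assumed setup: OPENAI_API_KEY is set; label set and prompt wording are illustrative.
from openai import OpenAI

client = OpenAI()

def temporal_relation(document, event_a, event_b):
    prompt = (
        f"Document: {document}\n"
        f"Question: What is the temporal relation between the event '{event_a}' "
        f"and the event '{event_b}'? Answer with exactly one of: BEFORE, AFTER, EQUAL, VAGUE."
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # stand-in for "ChatGPT"
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return response.choices[0].message.content.strip()
```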
Sentiment-guided Transformer with Severity-aware Contrastive Learning for Depression Detection on Social Media
Tianlin Zhang
|
Kailai Yang
|
Sophia Ananiadou
Proceedings of the 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
Early identification of depression is beneficial to public health surveillance and disease treatment. There are many models that mainly treat the detection as a binary classification task, such as detecting whether a user is depressed. However, identifying users’ depression severity levels from posts on social media is more clinically useful for future prevention and treatment. Existing severity detection methods mainly model the semantic information of posts while ignoring the relevant sentiment information, which can reflect the user’s state of mind and could be helpful for severity detection. In addition, they treat all severity levels equally, making the model difficult to distinguish between closely-labeled categories. We propose a sentiment-guided Transformer model, which efficiently fuses social media posts’ semantic information with sentiment information. Furthermore, we also utilize a supervised severity-aware contrastive learning framework to enable the model to better distinguish between different severity levels. The experimental results show that our model achieves superior performance on two public datasets, while further analysis proves the effectiveness of all proposed modules.
DISTANT: Distantly Supervised Entity Span Detection and Classification
Ken Yano
|
Makoto Miwa
|
Sophia Ananiadou
Proceedings of the 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
We propose DISTANT (DIstantly Supervised enTity spAN deTection and classification), a distantly supervised pipeline NER system that performs entity span detection and entity classification in sequence. The entity span detector first extracts possible entity mention spans using distant supervision. The entity classifier then assigns each entity span to one of the positive entity types or to none, employing a positive and unlabeled (PU) learning framework. Two models were built on the pre-trained SciBERT model and fine-tuned with the silver corpus generated by distant supervision. Experimental results on the BC5CDR and NCBI-Disease datasets show that our method outperforms the end-to-end NER baselines without PU learning by a large margin. In particular, it effectively increases the recall score.
Overview of the BioLaySumm 2023 Shared Task on Lay Summarization of Biomedical Research Articles
Tomas Goldsack
|
Zheheng Luo
|
Qianqian Xie
|
Carolina Scarton
|
Matthew Shardlow
|
Sophia Ananiadou
|
Chenghua Lin
Proceedings of the 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks
This paper presents the results of the shared task on Lay Summarisation of Biomedical Research Articles (BioLaySumm), hosted at the BioNLP Workshop at ACL 2023. The goal of this shared task is to develop abstractive summarisation models capable of generating "lay summaries" (i.e., summaries that are comprehensible to non-technical audiences) in both a controllable and non-controllable setting. There are two subtasks: 1) Lay Summarisation, where the goal is for participants to build models for lay summary generation only, given the full article text and the corresponding abstract as input; and 2) Readability-controlled Summarisation, where the goal is for participants to train models to generate both the technical abstract and the lay summary, given an article's main text as input. In addition to overall results, we report on the setup and insights from the BioLaySumm shared task, which attracted a total of 20 participating teams across both subtasks.
Entity Coreference and Co-occurrence Aware Argument Mining from Biomedical Literature
Boyang Liu
|
Viktor Schlegel
|
Riza Batista-Navarro
|
Sophia Ananiadou
Proceedings of the 4th Workshop on Computational Approaches to Discourse (CODI 2023)
Biomedical argument mining (BAM) aims at automatically identifying the argumentative structure in biomedical texts. However, identifying and classifying argumentative relations (AR) between argumentative components (AC) is challenging, since it requires not only understanding the semantics of ACs but also capturing the interactions between them. We argue that entities can serve as bridges that connect different ACs, since entities and their mentions convey significant semantic information in biomedical argumentation. For example, it is common that related AC pairs share a common entity. Capturing such entity information can be beneficial for the Relation Identification (RI) task. In order to incorporate this entity information into BAM, we propose an Entity Coreference and Co-occurrence aware Argument Mining (ECCAM) framework based on an edge-oriented graph model for BAM. We evaluate our model on a benchmark dataset, and the experimental results show that our method improves upon state-of-the-art methods.
Span-based Named Entity Recognition by Generating and Compressing Information
Nhung T. H. Nguyen
|
Makoto Miwa
|
Sophia Ananiadou
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics
The information bottleneck (IB) principle has been proven effective in various NLP applications. The existing work, however, only used either generative or information compression models to improve the performance of the target task. In this paper, we propose to combine the two types of IB models into one system to enhance Named Entity Recognition (NER). For one type of IB model, we incorporate two unsupervised generative components, span reconstruction and synonym generation, into a span-based NER system. The span reconstruction ensures that the contextualised span representation keeps the span information, while the synonym generation makes synonyms have similar representations even in different contexts. For the other type of IB model, we add a supervised IB layer that performs information compression into the system to preserve useful features for NER in the resulting span representations. Experiments on five different corpora indicate that jointly training both generative and information compression models can enhance the performance of the baseline span-based NER system. Our source code is publicly available at https://github.com/nguyennth/joint-ib-models.
Towards Interpretable Mental Health Analysis with Large Language Models
Kailai Yang
|
Shaoxiong Ji
|
Tianlin Zhang
|
Qianqian Xie
|
Ziyan Kuang
|
Sophia Ananiadou
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
The latest large language models (LLMs), such as ChatGPT, exhibit strong capabilities in automated mental health analysis. However, existing relevant studies bear several limitations, including inadequate evaluations, a lack of prompting strategies, and little exploration of LLMs for explainability. To bridge these gaps, we comprehensively evaluate the mental health analysis and emotional reasoning ability of LLMs on 11 datasets across 5 tasks. We explore the effects of different prompting strategies with unsupervised and distantly supervised emotional information. Based on these prompts, we explore LLMs for interpretable mental health analysis by instructing them to generate explanations for each of their decisions. We conduct strict human evaluations to assess the quality of the generated explanations, leading to a novel dataset with 163 human-assessed explanations. We benchmark existing automatic evaluation metrics on this dataset to guide future related works. According to the results, ChatGPT shows strong in-context learning ability but still has a significant gap with advanced task-specific methods. Careful prompt engineering with emotional cues and expert-written few-shot examples can also effectively improve performance on mental health analysis. In addition, ChatGPT generates explanations that approach human performance, showing its great potential in explainable mental health analysis.
Argument mining as a multi-hop generative machine reading comprehension task
Boyang Liu
|
Viktor Schlegel
|
Riza Batista-Navarro
|
Sophia Ananiadou
Findings of the Association for Computational Linguistics: EMNLP 2023
Argument mining (AM) is a natural language processing task that aims to generate an argumentative graph given an unstructured argumentative text. An argumentative graph that consists of argumentative components and argumentative relations contains completed information of an argument and exhibits the logic of an argument. As the argument structure of an argumentative text can be regarded as an answer to a “why” question, the whole argument structure is therefore similar to the “chain of thought” concept, i.e., the sequence of ideas that lead to a specific conclusion for a given argument (Wei et al., 2022). For argumentative texts in the same specific genre, the “chain of thought” of such texts is usually similar, i.e., in a student essay, there is usually a major claim supported by several claims, and then a number of premises which are related to the claims are included (Eger et al., 2017). In this paper, we propose a new perspective which transfers the argument mining task into a multi-hop reading comprehension task, allowing the model to learn the argument structure as a “chain of thought”. We perform a comprehensive evaluation of our approach on two AM benchmarks and find that we surpass SOTA results. A detailed analysis shows that specifically the “chain of thought” information is helpful for the argument mining task.
Document-level Text Simplification with Coherence Evaluation
Laura Vásquez-Rodríguez
|
Matthew Shardlow
|
Piotr Przybyła
|
Sophia Ananiadou
Proceedings of the Second Workshop on Text Simplification, Accessibility and Readability
We present a coherence-aware evaluation of document-level Text Simplification (TS), an approach that has not been considered in TS so far. We improve current TS sentence-based models to support a multi-sentence setting and the implementation of a state-of-the-art neural coherence model for simplification quality assessment. We enhanced English sentence simplification neural models for document-level simplification using 136,113 paragraph-level samples from both the general and medical domains to generate multiple sentences. Additionally, we use document-level simplification, readability and coherence metrics for evaluation. Our contributions include the introduction of coherence assessment into simplification evaluation with the automatic evaluation of 34,052 simplifications, a fine-tuned state-of-the-art model for document-level simplification, a coherence-based analysis of our results and a human evaluation of 300 samples that demonstrates the challenges encountered when moving towards document-level simplification.
PESTO: A Post-User Fusion Network for Rumour Detection on Social Media
Erxue Min
|
Sophia Ananiadou
Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis
Rumour detection on social media is an important topic due to the challenges of misinformation propagation and slow verification of misleading information. Most previous work focuses on the response posts on social media, ignoring the useful characteristics of involved users and their relations. In this paper, we propose a novel framework, the Post-User Fusion Network (PESTO), which models the patterns of rumours from both post diffusion and user social networks. Specifically, we propose a novel Chronologically-masked Transformer architecture to model both the temporal sequence and diffusion structure of rumours, and apply a Relational Graph Convolutional Network to model the social relations of involved users, with a fusion network based on the self-attention mechanism to incorporate the two aspects. Additionally, two data augmentation techniques are leveraged to improve the robustness and accuracy of our models. Empirical results on four datasets of English tweets show the superiority of the proposed method.
2022
Learning Disentangled Representations of Negation and Uncertainty
Jake Vasilakes
|
Chrysoula Zerva
|
Makoto Miwa
|
Sophia Ananiadou
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Negation and uncertainty modeling are long-standing tasks in natural language processing. Linguistic theory postulates that expressions of negation and uncertainty are semantically independent from each other and the content they modify. However, previous works on representation learning do not explicitly model this independence. We therefore attempt to disentangle the representations of negation, uncertainty, and content using a Variational Autoencoder. We find that simply supervising the latent representations results in good disentanglement, but auxiliary objectives based on adversarial learning and mutual information minimization can provide additional disentanglement gains.
Proceedings of the 21st Workshop on Biomedical Language Processing
Dina Demner-Fushman
|
Kevin Bretonnel Cohen
|
Sophia Ananiadou
|
Junichi Tsujii
Proceedings of the 21st Workshop on Biomedical Language Processing
Named Entity Recognition for Cancer Immunology Research Using Distant Supervision
Hai-Long Trieu
|
Makoto Miwa
|
Sophia Ananiadou
Proceedings of the 21st Workshop on Biomedical Language Processing
Cancer immunology research involves several important cell and protein factors. Extracting information about such cells and proteins and the interactions between them from text is crucial in text mining for cancer immunology research. However, there are few available datasets for these entities, and the amount of annotated documents is not sufficient compared with other major named entity types. In this work, we introduce our automatically annotated dataset of key named entities, i.e., T-cells, cytokines, and transcription factors, which are central to recent cancer immunotherapy. The entities are annotated based on the UniProtKB knowledge base using dictionary matching. We build a neural named entity recognition (NER) model to be trained on this dataset and evaluate it on manually annotated data. Experimental results show that we can achieve promising NER performance even though our data is automatically annotated. Our dataset also enhances NER performance when combined with existing data, especially improving results on less-investigated named entity types such as cytokines and transcription factors.
GenCompareSum: a hybrid unsupervised summarization method using salience
Jennifer Bishop
|
Qianqian Xie
|
Sophia Ananiadou
Proceedings of the 21st Workshop on Biomedical Language Processing
Text summarization (TS) is an important NLP task. Pre-trained Language Models (PLMs) have been used to improve the performance of TS. However, PLMs are limited by their need for labelled training data and by their attention mechanism, which often makes them unsuitable for use on long documents. To this end, we propose a hybrid, unsupervised, abstractive-extractive approach, in which we walk through a document, generating salient textual fragments representing its key points. We then select the most important sentences of the document by choosing the sentences most similar to the generated texts, calculated using BERTScore. We evaluate the efficacy of generating and using salient textual fragments to guide extractive summarization on documents from the biomedical and general scientific domains. We compare the performance between long and short documents using different generative text models, which are fine-tuned to generate relevant queries or document titles. We show that our hybrid approach outperforms existing unsupervised methods, as well as state-of-the-art supervised methods, despite not needing a vast amount of labelled training data.
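A rough sketch of the selection step described above, assuming fragments have already been generated by some seq2seq model and that BERTScore (as named in the abstract) scores each document sentence; function and variable names are illustrative.

```python
# Assumes `pip install bert-score`; fragment generation is left to any seq2seq model.
from bert_score import score as bertscore

def select_sentences(doc_sentences, salient_fragments, n=5):
    """Keep the n sentences with the highest aggregate BERTScore F1 against the
    generated salient fragments."""
    totals = []
    for sentence in doc_sentences:
        _, _, f1 = bertscore([sentence] * len(salient_fragments), salient_fragments, lang="en")
        totals.append(float(f1.sum()))
    top = sorted(range(len(doc_sentences)), key=lambda i: -totals[i])[:n]
    return [doc_sentences[i] for i in sorted(top)]  # restore document order
```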
GRETEL: Graph Contrastive Topic Enhanced Language Model for Long Document Extractive Summarization
Qianqian Xie
|
Jimin Huang
|
Tulika Saha
|
Sophia Ananiadou
Proceedings of the 29th International Conference on Computational Linguistics
Recently, neural topic models (NTMs) have been incorporated into pre-trained language models (PLMs), to capture the global semantic information for text summarization. However, in these methods, there remain limitations in the way they capture and integrate the global semantic information. In this paper, we propose a novel model, the graph contrastive topic enhanced language model (GRETEL), that incorporates the graph contrastive topic model with the pre-trained language model, to fully leverage both the global and local contextual semantics for long document extractive summarization. To better capture and incorporate the global semantic information into PLMs, the graph contrastive topic model integrates the hierarchical transformer encoder and the graph contrastive learning to fuse the semantic information from the global document context and the gold summary. To this end, GRETEL encourages the model to efficiently extract salient sentences that are topically related to the gold summary, rather than redundant sentences that cover sub-optimal topics. Experimental results on both general domain and biomedical datasets demonstrate that our proposed method outperforms SOTA methods.
Readability Controllable Biomedical Document Summarization
Zheheng Luo
|
Qianqian Xie
|
Sophia Ananiadou
Findings of the Association for Computational Linguistics: EMNLP 2022
Different from general documents, it is recognised that the ease with which people can understand a biomedical text is eminently varied, owing to the highly technical nature of biomedical documents and the variance of readers' domain knowledge. However, existing biomedical document summarization systems have paid little attention to readability control, leaving users with summaries that are incompatible with their levels of expertise. In recognition of this urgent demand, we introduce a new task of readability controllable summarization for biomedical documents, which aims to recognise users' readability demands and generate summaries that better suit their needs: technical summaries for experts and plain language summaries (PLS) for laymen. To establish this task, we construct a corpus consisting of biomedical papers with technical summaries and PLSs written by the authors, and benchmark multiple advanced controllable abstractive and extractive summarization models based on pre-trained language models (PLMs) with prevalent controlling and generation techniques. Moreover, we propose a novel masked language model (MLM) based metric and its variant to effectively evaluate the readability discrepancy between lay and technical summaries. Experimental results from automated and human evaluations show that though current control techniques allow for a certain degree of readability adjustment during generation, the performance of existing controllable summarization methods is far from desirable in this task.
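The abstract names an MLM-based readability metric without giving its formula; the sketch below is one plausible instantiation, masked pseudo-log-likelihood under BERT, offered purely as an assumption about how such a signal can be computed, not as the paper's definition.

```python
# Assumption-laden sketch: masked pseudo-log-likelihood as a readability-style signal.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

@torch.no_grad()
def pseudo_log_likelihood(text):
    """Average log-probability of each token when it is masked in turn."""
    ids = tok(text, return_tensors="pt", truncation=True)["input_ids"][0]
    total = 0.0
    for pos in range(1, len(ids) - 1):          # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[pos] = tok.mask_token_id
        logits = mlm(masked.unsqueeze(0)).logits[0, pos]
        total += torch.log_softmax(logits, dim=-1)[ids[pos]].item()
    return total / (len(ids) - 2)

# Comparing the score of a lay summary with that of its technical counterpart gives
# one rough view of their readability discrepancy.
```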
Text Classification and Prediction in the Legal Domain
Minh-Quoc Nghiem
|
Paul Baylis
|
André Freitas
|
Sophia Ananiadou
Proceedings of the Thirteenth Language Resources and Evaluation Conference
We present a case study on the application of text classification and legal judgment prediction for flight compensation. We combine transformer-based classification models to classify responses from airlines and incorporate text data with other data types to predict a legal claim being successful. Our experimental evaluations show that our models achieve consistent and significant improvements over baselines and even outperformed human prediction when predicting a claim being successful. These models were integrated into an existing claim management system, providing substantial productivity gains for handling the case lifecycle, currently supporting several thousands of monthly processes.
Incorporating Zoning Information into Argument Mining from Biomedical Literature
Boyang Liu
|
Viktor Schlegel
|
Riza Batista-Navarro
|
Sophia Ananiadou
Proceedings of the Thirteenth Language Resources and Evaluation Conference
The goal of text zoning is to segment a text into zones (e.g., Background, Conclusion) that serve distinct functions. Argumentative zoning, a specific text zoning scheme for the scientific domain, is considered by many researchers as the antecedent of argument mining. Surprisingly, however, little work is concerned with exploiting zoning information to improve the performance of argument mining models, despite the relatedness of the two tasks. In this paper, we propose two transformer-based models to incorporate zoning information into argumentative component identification and classification tasks. One model is for the sentence-level argument mining task and the other is for the token-level task. In particular, we add the zoning labels predicted by an off-the-shelf model to the beginning of each sentence, inspired by the convention commonly used in biomedical abstracts. Moreover, we employ multi-head attention to transfer the sentence-level zoning information to each token in a sentence. Based on the experimental results, we find a significant improvement in F1-scores for both sentence- and token-level tasks. It is worth mentioning that these zoning labels can be obtained with high accuracy by utilising readily available automated methods. Thus, existing argument mining models can be improved by incorporating zoning information without any additional annotation cost.
RELATE: Generating a linguistically inspired Knowledge Graph for fine-grained emotion classification
Annika Marie Schoene
|
Nina Dethlefs
|
Sophia Ananiadou
Proceedings of the Thirteenth Language Resources and Evaluation Conference
Several existing resources are available for sentiment analysis (SA) tasks that are used for learning sentiment specific embedding (SSE) representations. These resources are either large, common-sense knowledge graphs (KG) that cover a limited amount of polarities/emotions or they are smaller in size (e.g.: lexicons), which require costly human annotation and cover fine-grained emotions. Therefore using knowledge resources to learn SSE representations is either limited by the low coverage of polarities/emotions or the overall size of a resource. In this paper, we first introduce a new directed KG called ‘RELATE’, which is built to overcome both the issue of low coverage of emotions and the issue of scalability. RELATE is the first KG of its size to cover Ekman’s six basic emotions that are directed towards entities. It is based on linguistic rules to incorporate the benefit of semantics without relying on costly human annotation. The performance of ‘RELATE’ is evaluated by learning SSE representations using a Graph Convolutional Neural Network (GCN).
UoM&MMU at TSAR-2022 Shared Task: Prompt Learning for Lexical Simplification
Laura Vásquez-Rodríguez
|
Nhung Nguyen
|
Matthew Shardlow
|
Sophia Ananiadou
Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022)
We present PromptLS, a method for fine-tuning large pre-trained Language Models (LMs) to perform the task of Lexical Simplification. We use a predefined template to obtain appropriate replacements for a term, and fine-tune an LM using this template on language-specific datasets. We filter candidate lists in post-processing to improve accuracy. We demonstrate that our model can work in a) a zero-shot setting (where we only require a pre-trained LM), b) a fine-tuned setting (where language-specific data is required), and c) a multilingual setting (where the model is pre-trained across multiple languages and fine-tuned in a specific language). Experimental results show that, although the zero-shot setting is competitive, its performance is still far from that of the fine-tuned setting. Also, the multilingual setting is, unsurprisingly, worse than the fine-tuned model. Among all TSAR-2022 Shared Task participants, our team ranked second in Spanish and third in English.
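An illustrative zero-shot version of the template idea, using an off-the-shelf fill-mask pipeline; the template wording, model, and post-processing filter are assumptions and do not reproduce the fine-tuned PromptLS configuration.

```python
# Zero-shot sketch only; PromptLS additionally fine-tunes the LM on language-specific data.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

def candidate_substitutions(sentence, complex_word, k=10):
    prompt = f"{sentence} A simpler word for {complex_word} is {fill.tokenizer.mask_token}."
    predictions = fill(prompt, top_k=k)
    # Post-processing filter (as described in the abstract): drop the complex word itself.
    return [p["token_str"].strip() for p in predictions
            if p["token_str"].strip().lower() != complex_word.lower()]
```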
2021
Proceedings of the 20th Workshop on Biomedical Language Processing
Dina Demner-Fushman
|
Kevin Bretonnel Cohen
|
Sophia Ananiadou
|
Junichi Tsujii
Proceedings of the 20th Workshop on Biomedical Language Processing
SpanEmo: Casting Multi-label Emotion Classification as Span-prediction
Hassan Alhuzali
|
Sophia Ananiadou
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Emotion recognition (ER) is an important task in Natural Language Processing (NLP), due to its high impact in real-world applications from health and well-being to author profiling, consumer analysis and security. Current approaches to ER mainly classify emotions independently, without considering that emotions can co-exist. Such approaches overlook potential ambiguities, in which multiple emotions overlap. We propose a new model, "SpanEmo", casting multi-label emotion classification as span-prediction, which can aid ER models to learn associations between labels and words in a sentence. Furthermore, we introduce a loss function focused on modelling multiple co-existing emotions in the input sentence. Experiments performed on the SemEval2018 multi-label emotion data over three language sets (i.e., English, Arabic and Spanish) demonstrate our method's effectiveness. Finally, we present different analyses that illustrate the benefits of our method in terms of improving the model performance and learning meaningful associations between emotion classes and words in the sentence.
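A minimal sketch of the span-prediction framing described above, assuming the emotion names are prepended to the input and each label is read off its own token position; the label subset, encoder checkpoint, and classification head are illustrative assumptions.

```python
# Assumes each emotion name is a single wordpiece, so positions 1..len(LABELS)
# hold the label tokens; the real SpanEmo model and label set differ.
import torch
from transformers import AutoModel, AutoTokenizer

LABELS = ["anger", "joy", "sadness", "fear"]  # illustrative subset of the label set
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
head = torch.nn.Linear(encoder.config.hidden_size, 1)

def predict(sentence):
    batch = tok(" ".join(LABELS), sentence, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state[0]
    label_states = hidden[1:1 + len(LABELS)]   # token positions of the label words
    probabilities = torch.sigmoid(head(label_states)).squeeze(-1)
    return dict(zip(LABELS, probabilities.tolist()))
```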
Paladin: an annotation tool based on active and proactive learning
Minh-Quoc Nghiem
|
Paul Baylis
|
Sophia Ananiadou
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations
In this paper, we present Paladin, an open-source web-based annotation tool for creating high-quality multi-label document-level datasets. By integrating active learning and proactive learning to the annotation task, Paladin makes the task less time-consuming and requiring less human effort. Although Paladin is designed for multi-label settings, the system is flexible and can be adapted to other tasks in single-label settings.
Investigating Text Simplification Evaluation
Laura Vásquez-Rodríguez
|
Matthew Shardlow
|
Piotr Przybyła
|
Sophia Ananiadou
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
GenerativeRE: Incorporating a Novel Copy Mechanism and Pretrained Model for Joint Entity and Relation Extraction
Jiarun Cao
|
Sophia Ananiadou
Findings of the Association for Computational Linguistics: EMNLP 2021
Previous neural Seq2Seq models have shown the effectiveness for jointly extracting relation triplets. However, most of these models suffer from incompletion and disorder problems when they extract multi-token entities from input sentences. To tackle these problems, we propose a generative, multi-task learning framework, named GenerativeRE. We firstly propose a special entity labelling method on both input and output sequences. During the training stage, GenerativeRE fine-tunes the pre-trained generative model and learns the special entity labels simultaneously. During the inference stage, we propose a novel copy mechanism equipped with three mask strategies, to generate the most probable tokens by diminishing the scope of the model decoder. Experimental results show that our model achieves 4.6% and 0.9% F1 score improvements over the current state-of-the-art methods in the NYT24 and NYT29 benchmark datasets respectively.
Distantly Supervised Relation Extraction with Sentence Reconstruction and Knowledge Base Priors
Fenia Christopoulou
|
Makoto Miwa
|
Sophia Ananiadou
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
We propose a multi-task, probabilistic approach to facilitate distantly supervised relation extraction by bringing closer the representations of sentences that contain the same Knowledge Base pairs. To achieve this, we bias the latent space of sentences via a Variational Autoencoder (VAE) that is trained jointly with a relation classifier. The latent code guides the pair representations and influences sentence reconstruction. Experimental results on two datasets created via distant supervision indicate that multi-task learning results in performance benefits. Additional exploration of employing Knowledge Base priors in the VAE reveals that the sentence space can be shifted towards that of the Knowledge Base, offering interpretability and further improving results.
2020
Revisiting Unsupervised Relation Extraction
Thy Thy Tran
|
Phong Le
|
Sophia Ananiadou
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics
Unsupervised relation extraction (URE) extracts relations between named entities from raw text without manually-labelled data and existing knowledge bases (KBs). URE methods can be categorised into generative and discriminative approaches, which rely either on hand-crafted features or surface form. However, we demonstrate that by using only named entities to induce relation types, we can outperform existing methods on two popular datasets. We conduct a comparison and evaluation of our findings with other URE techniques, to ascertain the important features in URE. We conclude that entity types provide a strong inductive bias for URE.
Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing
Dina Demner-Fushman
|
Kevin Bretonnel Cohen
|
Sophia Ananiadou
|
Junichi Tsujii
Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing
A Neural Model for Aggregating Coreference Annotation in Crowdsourcing
Maolin Li
|
Hiroya Takamura
|
Sophia Ananiadou
Proceedings of the 28th International Conference on Computational Linguistics
Coreference resolution is the task of identifying all mentions in a text that refer to the same real-world entity. Collecting sufficient labelled data from expert annotators to train a high-performance coreference resolution system is time-consuming and expensive. Crowdsourcing makes it possible to obtain the required amounts of data rapidly and cost-effectively. However, crowd-sourced labels can be noisy. To ensure high-quality data, it is crucial to infer the correct labels by aggregating the noisy labels. In this paper, we split the aggregation into two subtasks, i.e, mention classification and coreference chain inference. Firstly, we predict the general class of each mention using an autoencoder, which incorporates contextual information about each mention, while at the same time taking into account the mention’s annotation complexity and annotators’ reliability at different levels. Secondly, to determine the coreference chain of each mention, we use weighted voting which takes into account the learned reliability in the first subtask. Experimental results demonstrate the effectiveness of our method in predicting the correct labels. We also illustrate our model’s interpretability through a comprehensive analysis of experimental results.
Semantic Annotation for Improved Safety in Construction Work
Paul Thompson
|
Tim Yates
|
Emrah Inan
|
Sophia Ananiadou
Proceedings of the Twelfth Language Resources and Evaluation Conference
Risk management is a vital activity to ensure employee safety in construction projects. Various documents provide important supporting evidence, including details of previous incidents, consequences and mitigation strategies. Potential hazards may depend on a complex set of project-specific attributes, including activities undertaken, location, equipment used, etc. However, finding evidence about previous projects with similar attributes can be problematic, since information about risks and mitigations is usually hidden within and may be dispersed across a range of different free text documents. Automatic named entity recognition (NER), which identifies mentions of concepts in free text documents, is the first stage in structuring knowledge contained within them. While developing NER methods generally relies on annotated corpora, we are not aware of any such corpus targeted at concepts relevant to construction safety. In response, we have designed a novel named entity annotation scheme and associated guidelines for this domain, which covers hazards, consequences, mitigation strategies and project attributes. Four health and safety experts used the guidelines to annotate a total of 600 sentences from accident reports; an average inter-annotator agreement rate of 0.79 F-Score shows that our work constitutes an important first step towards developing tools for detailed semantic analysis of construction safety documents.
2019
A Search-based Neural Model for Biomedical Nested and Overlapping Event Detection
Kurt Junshean Espinosa
|
Makoto Miwa
|
Sophia Ananiadou
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
We tackle the nested and overlapping event detection task and propose a novel search-based neural network (SBNN) structured prediction model that treats the task as a search problem on a relation graph of trigger-argument structures. Unlike existing structured prediction tasks such as dependency parsing, the task aims to detect DAG structures, which constitute events, from the relation graph. We define actions to construct events and use all the beams in a beam search to detect all event structures, which may be overlapping and nested. The search process constructs events in a bottom-up manner while modelling the global properties of nested and overlapping structures simultaneously using neural networks. We show that the model achieves performance comparable to the state-of-the-art model, the Turku Event Extraction System (TEES), on the BioNLP Cancer Genetics (CG) Shared Task 2013 without the use of any syntactic or hand-engineered features. Further analyses on the development set show that our model is more computationally efficient while yielding higher F1-score performance.
Connecting the Dots: Document-level Neural Relation Extraction with Edge-oriented Graphs
Fenia Christopoulou
|
Makoto Miwa
|
Sophia Ananiadou
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Document-level relation extraction is a complex human process that requires logical inference to extract relationships between named entities in text. Existing approaches use graph-based neural models with words as nodes and edges as relations between them, to encode relations across sentences. These models are node-based, i.e., they form pair representations based solely on the two target node representations. However, entity relations can be better expressed through unique edge representations formed as paths between nodes. We thus propose an edge-oriented graph neural model for document-level relation extraction. The model utilises different types of nodes and edges to create a document-level graph. An inference mechanism on the graph edges enables to learn intra- and inter-sentence relations using multi-instance learning internally. Experiments on two document-level biomedical datasets for chemical-disease and gene-disease associations show the usefulness of the proposed edge-oriented approach.
pdf
bib
abs
Coreference Resolution in Full Text Articles with BERT and Syntax-based Mention Filtering
Hai-Long Trieu
|
Anh-Khoa Duong Nguyen
|
Nhung Nguyen
|
Makoto Miwa
|
Hiroya Takamura
|
Sophia Ananiadou
Proceedings of the 5th Workshop on BioNLP Open Shared Tasks
This paper describes our system developed for the coreference resolution task of the CRAFT Shared Tasks 2019. The CRAFT corpus is more challenging than other existing corpora because it contains full text articles. We have employed an existing span-based state-of-the-art neural coreference resolution system as a baseline system. We enhance the system with two different techniques to capture long-distance coreferent pairs. Firstly, we filter noisy mentions based on parse trees while increasing the number of antecedent candidates. Secondly, instead of relying on LSTMs, we integrate the highly expressive language model BERT into our model. Experimental results show that our proposed systems significantly outperform the baseline. The best performing system obtained F-scores of 44%, 48%, 39%, 49%, 40%, and 57% on the test set with B3, BLANC, CEAFE, CEAFM, LEA, and MUC metrics, respectively. Additionally, the proposed model is able to detect coreferent pairs over long distances, even distances of more than 200 sentences.
pdf
bib
abs
Modelling Instance-Level Annotator Reliability for Natural Language Labelling Tasks
Maolin Li
|
Arvid Fahlström Myrman
|
Tingting Mu
|
Sophia Ananiadou
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
When constructing models that learn from noisy labels produced by multiple annotators, it is important to accurately estimate the reliability of annotators. Annotators may provide labels of inconsistent quality due to their varying expertise and reliability in a domain. Previous studies have mostly focused on estimating each annotator’s overall reliability on the entire annotation task. However, in practice, the reliability of an annotator may depend on each specific instance. Only a limited number of studies have investigated modelling per-instance reliability and these only considered binary labels. In this paper, we propose an unsupervised model which can handle both binary and multi-class labels. It can automatically estimate the per-instance reliability of each annotator and the correct label for each instance. We specify our model as a probabilistic model which incorporates neural networks to model the dependency between latent variables and instances. For evaluation, the proposed method is applied to both synthetic and real data, including two labelling tasks: text classification and textual entailment. Experimental results demonstrate our novel method can not only accurately estimate the reliability of annotators across different instances, but also achieve superior performance in predicting the correct labels and detecting the least reliable annotators compared to state-of-the-art baselines.
pdf
bib
abs
Inter-sentence Relation Extraction with Document-level Graph Convolutional Neural Network
Sunil Kumar Sahu
|
Fenia Christopoulou
|
Makoto Miwa
|
Sophia Ananiadou
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Inter-sentence relation extraction deals with a number of complex semantic relationships in documents, which require local, non-local, syntactic and semantic dependencies. Existing methods do not fully exploit such dependencies. We present a novel inter-sentence relation extraction model that builds a labelled edge graph convolutional neural network model on a document-level graph. The graph is constructed using various inter- and intra-sentence dependencies to capture local and non-local dependency information. In order to predict the relation of an entity pair, we utilise multi-instance learning with bi-affine pairwise scoring. Experimental results show that our model achieves comparable performance to the state-of-the-art neural models on two biochemistry datasets. Our analysis shows that all the types in the graph are effective for inter-sentence relation extraction.
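As a rough illustration of the bi-affine pairwise scoring and multi-instance aggregation mentioned above, the sketch below scores every mention pair of a head and tail entity with a bi-affine form and takes a maximum over mention pairs. The dimensions, random weights and max-aggregation are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch of bi-affine pairwise scoring with multi-instance aggregation.
# Weights are random stand-ins for learned parameters.
import numpy as np

rng = np.random.default_rng(0)
d, n_rel = 16, 3
W = rng.normal(size=(n_rel, d, d))          # one bi-affine matrix per relation
b = rng.normal(size=n_rel)

def biaffine(head, tail):
    return np.einsum("d,rde,e->r", head, W, tail) + b

# two mentions of each entity somewhere in the document
head_mentions = rng.normal(size=(2, d))
tail_mentions = rng.normal(size=(2, d))
pair_scores = np.stack([biaffine(h, t) for h in head_mentions
                                       for t in tail_mentions])
print(pair_scores.max(axis=0))              # aggregate over mention pairs
```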
pdf
bib
Proceedings of the 18th BioNLP Workshop and Shared Task
Dina Demner-Fushman
|
Kevin Bretonnel Cohen
|
Sophia Ananiadou
|
Junichi Tsujii
Proceedings of the 18th BioNLP Workshop and Shared Task
pdf
bib
abs
Improving classification of Adverse Drug Reactions through Using Sentiment Analysis and Transfer Learning
Hassan Alhuzali
|
Sophia Ananiadou
Proceedings of the 18th BioNLP Workshop and Shared Task
The availability of large-scale and real-time data on social media has motivated research into adverse drug reactions (ADRs). ADR classification helps to identify negative effects of drugs, which can guide health professionals and pharmaceutical companies in making medications safer and advocating patients’ safety. Based on the observation that in social media, negative sentiment is frequently expressed towards ADRs, this study presents a neural model that combines sentiment analysis with transfer learning techniques to improve ADR detection in social media postings. Our system is firstly trained to classify sentiment in tweets concerning current affairs, using the SemEval17-task4A corpus. We then apply transfer learning to adapt the model to the task of detecting ADRs in social media postings. We show that, in combination with rich representations of words and their contexts, transfer learning is beneficial, especially given the large degree of vocabulary overlap between the current affairs posts in the SemEval17-task4A corpus and posts about ADRs. We compare our results with previous approaches, and show that our model can outperform them by up to 3% F-score.
2018
pdf
bib
abs
APLenty: annotation tool for creating high-quality datasets using active and proactive learning
Minh-Quoc Nghiem
|
Sophia Ananiadou
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
In this paper, we present APLenty, an annotation tool for creating high-quality sequence labeling datasets using active and proactive learning. A major innovation of our tool is the integration of automatic annotation with active learning and proactive learning. This makes the task of creating labeled datasets easier, less time-consuming, and less demanding of human effort. APLenty is highly flexible and can be adapted to various other tasks.
pdf
bib
A New Corpus to Support Text Mining for the Curation of Metabolites in the ChEBI Database
Matthew Shardlow
|
Nhung Nguyen
|
Gareth Owen
|
Claire O’Donovan
|
Andrew Leach
|
John McNaught
|
Steve Turner
|
Sophia Ananiadou
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)
pdf
bib
abs
A Neural Layered Model for Nested Named Entity Recognition
Meizhi Ju
|
Makoto Miwa
|
Sophia Ananiadou
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)
Entity mentions embedded in longer entity mentions are referred to as nested entities. Most named entity recognition (NER) systems deal only with flat entities and ignore the inner nested ones, which fails to capture finer-grained semantic information in underlying texts. To address this issue, we propose a novel neural model to identify nested entities by dynamically stacking flat NER layers. Each flat NER layer is based on the state-of-the-art flat NER model, which captures sequential context representations with a bidirectional Long Short-Term Memory (LSTM) layer and feeds them to a cascaded CRF layer. Our model merges the output of the LSTM layer in the current flat NER layer to build new representations for detected entities, which it subsequently feeds into the next flat NER layer. This allows our model to extract outer entities by taking full advantage of the information encoded in their corresponding inner entities, in an inside-to-outside way. Our model dynamically stacks the flat NER layers until no outer entities are extracted. Extensive evaluation shows that our dynamic model outperforms state-of-the-art feature-based systems on nested NER, achieving F-scores of 74.7% and 72.2% on the GENIA and ACE2005 datasets, respectively.
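The dynamic stacking described above can be pictured with the toy sketch below: each "layer" (here a stand-in dictionary tagger rather than a BiLSTM-CRF) detects flat entities, detected spans are merged into single units so the next layer can recognise the entities that enclose them, and stacking stops once a layer finds nothing new. The lexicon and example sentence are hypothetical.

```python
# Toy sketch of dynamically stacked flat NER layers; LEXICON and the tagger
# are hypothetical stand-ins for the learned BiLSTM-CRF layers in the paper.
LEXICON = {
    "tumor necrosis factor": "PROTEIN",
    "tumor necrosis factor gene": "DNA",
}

def flat_layer(tokens, max_len=3):
    """Return non-overlapping (start, end, label) spans found in the lexicon."""
    spans, i = [], 0
    while i < len(tokens):
        for n in range(max_len, 0, -1):              # prefer the longest match
            surface = " ".join(tokens[i:i + n])
            if surface in LEXICON:
                spans.append((i, i + n, LEXICON[surface]))
                i += n
                break
        else:
            i += 1
    return spans

def nested_ner(tokens, max_layers=5):
    entities = []
    for _ in range(max_layers):
        spans = flat_layer(tokens)
        new = [(" ".join(tokens[s:e]), lab) for s, e, lab in spans
               if (" ".join(tokens[s:e]), lab) not in entities]
        if not new:
            break                                    # no outer entities left
        entities.extend(new)
        # merge each detected span into one unit for the next (outer) layer
        ends = {s: e for s, e, _ in spans}
        merged, i = [], 0
        while i < len(tokens):
            if i in ends:
                merged.append(" ".join(tokens[i:ends[i]]))
                i = ends[i]
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return entities

print(nested_ner("the tumor necrosis factor gene was mutated".split()))
```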
pdf
bib
abs
A Walk-based Model on Entity Graphs for Relation Extraction
Fenia Christopoulou
|
Makoto Miwa
|
Sophia Ananiadou
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)
We present a novel graph-based neural network model for relation extraction. Our model treats multiple pairs in a sentence simultaneously and considers interactions among them. All the entities in a sentence are placed as nodes in a fully-connected graph structure. The edges are represented with position-aware contexts around the entity pairs. In order to consider different relation paths between two entities, we construct up to l-length walks between each pair. The resulting walks are merged and iteratively used to update the edge representations into longer walk representations. We show that the model achieves performance comparable to the state-of-the-art systems on the ACE 2005 dataset without using any external tools.
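A rough sketch of this walk-aggregation idea follows: length-n walk representations between entity pairs are combined through every intermediate node to form length-2n representations. The elementwise product and sum used below are simplifying assumptions; the paper combines walks with learned parameters.

```python
# Illustrative sketch only: doubling the walk length by combining every pair
# of edge representations that meets at an intermediate node.
import numpy as np

def extend_walks(edge_repr):
    """edge_repr: (N, N, D) array of pairwise edge representations."""
    n, _, _ = edge_repr.shape
    combined = np.zeros_like(edge_repr)
    for i in range(n):
        for j in range(n):
            for k in range(n):          # walk i -> k -> j through node k
                if k in (i, j):
                    continue
                combined[i, j] += edge_repr[i, k] * edge_repr[k, j]
    return combined

rng = np.random.default_rng(0)
edges = rng.normal(size=(4, 4, 8))      # 4 entities, 8-dim edge representations
length2 = extend_walks(edges)           # walks of length 2
length4 = extend_walks(length2)         # walks of length up to 4
print(length4.shape)
```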
pdf
bib
abs
Paths for uncertainty: Exploring the intricacies of uncertainty identification for news
Chrysoula Zerva
|
Sophia Ananiadou
Proceedings of the Workshop on Computational Semantics beyond Events and Roles
Currently, news articles are produced, shared and consumed at an extremely rapid rate. Although their quantity is increasing, their quality and trustworthiness are, at the same time, becoming fuzzier. Hence, it is important not only to automate information extraction but also to quantify the certainty of this information. Automated identification of certainty has been studied in both the scientific and newswire domains, but performance is considerably higher in tasks focusing on scientific text. We compare the differences in the definition and expression of uncertainty between a scientific domain, i.e., biomedicine, and newswire. We delve into the different aspects that affect the certainty of an extracted event in a news article and examine whether they can be easily identified by techniques already validated in the biomedical domain. Finally, we present a comparison of the syntactic and lexical differences between the expression of certainty in the biomedical and newswire domains, using two annotated corpora.
pdf
bib
Proceedings of the BioNLP 2018 workshop
Dina Demner-Fushman
|
Kevin Bretonnel Cohen
|
Sophia Ananiadou
|
Junichi Tsujii
Proceedings of the BioNLP 2018 workshop
pdf
bib
abs
Investigating Domain-Specific Information for Neural Coreference Resolution on Biomedical Texts
Hai-Long Trieu
|
Nhung T. H. Nguyen
|
Makoto Miwa
|
Sophia Ananiadou
Proceedings of the BioNLP 2018 workshop
Existing biomedical coreference resolution systems depend on features and/or rules based on syntactic parsers. In this paper, we investigate the utility of a state-of-the-art general-domain neural coreference resolution system on biomedical texts. The system is an end-to-end system that does not depend on any syntactic parsers. We also investigate domain-specific features to enhance the system for biomedical texts. Experimental results on the BioNLP Protein Coreference dataset and the CRAFT corpus show that, with no parser information, the adapted system compared favorably with the systems that depend on parser information on these datasets, achieving F1 scores of 51.23% on the BioNLP dataset and 36.33% on the CRAFT corpus. In-domain embeddings and domain-specific features helped improve the performance on the BioNLP dataset, but not on the CRAFT corpus.
2017
pdf
bib
abs
Distributed Document and Phrase Co-embeddings for Descriptive Clustering
Motoki Sato
|
Austin J. Brockmeier
|
Georgios Kontonatsios
|
Tingting Mu
|
John Y. Goulermas
|
Jun’ichi Tsujii
|
Sophia Ananiadou
Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers
Descriptive document clustering aims to automatically discover groups of semantically related documents and to assign a meaningful label to characterise the content of each cluster. In this paper, we present a descriptive clustering approach that employs a distributed representation model, namely the paragraph vector model, to capture semantic similarities between documents and phrases. The proposed method uses a joint representation of phrases and documents (i.e., a co-embedding) to automatically select a descriptive phrase that best represents each document cluster. We evaluate our method by comparing its performance to an existing state-of-the-art descriptive clustering method that also uses co-embedding but relies on a bag-of-words representation. Results obtained on benchmark datasets demonstrate that the paragraph vector-based method obtains superior performance over the existing approach in both identifying clusters and assigning appropriate descriptive labels to them.
pdf
bib
Proceedings of the 16th BioNLP Workshop
Kevin Bretonnel Cohen
|
Dina Demner-Fushman
|
Sophia Ananiadou
|
Junichi Tsujii
Proceedings of the 16th BioNLP Workshop
pdf
bib
abs
Proactive Learning for Named Entity Recognition
Maolin Li
|
Nhung Nguyen
|
Sophia Ananiadou
Proceedings of the 16th BioNLP Workshop
The goal of active learning is to minimise the cost of producing an annotated dataset, in which annotators are assumed to be perfect, i.e., they always choose the correct labels. However, in practice, annotators are not infallible, and they are likely to assign incorrect labels to some instances. Proactive learning is a generalisation of active learning that can model different kinds of annotators. Although proactive learning has been applied to certain labelling tasks, such as text classification, there is little work on its application to named entity (NE) tagging. In this paper, we propose a proactive learning method for producing NE annotated corpora, using two annotators with different levels of expertise, and who charge different amounts based on their levels of experience. To optimise both cost and annotation quality, we also propose a mechanism to present multiple sentences to annotators at each iteration. Experimental results for several corpora show that our method facilitates the construction of high-quality NE labelled datasets at minimal cost.
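One simple way to picture the cost/quality trade-off described above is sketched below: the most uncertain sentences are batched, and each is routed to the annotator whose estimated accuracy and cost give the best expected value. The utility function, accuracy estimates and costs are hypothetical placeholders, not the selection criteria used in the paper.

```python
# Toy sketch only, not the method in the paper: trading off annotator
# reliability and cost when choosing who labels which sentence.
def pick_batch(sentences, uncertainty, annotators, batch_size=5):
    """Assign each of the most uncertain sentences to the annotator whose
    estimated accuracy and cost give the best expected value."""
    ranked = sorted(sentences, key=lambda s: uncertainty[s], reverse=True)
    assignments = []
    for sent in ranked[:batch_size]:
        best = max(annotators,
                   key=lambda a: uncertainty[sent] * a["accuracy"] - a["cost"])
        assignments.append((sent, best["name"]))
    return assignments

annotators = [{"name": "expert", "accuracy": 0.95, "cost": 1.0},
              {"name": "novice", "accuracy": 0.75, "cost": 0.3}]
uncertainty = {"s1": 0.9, "s2": 0.2, "s3": 0.7}
print(pick_batch(["s1", "s2", "s3"], uncertainty, annotators, batch_size=2))
```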
2016
pdf
bib
abs
Ensemble Classification of Grants using LDA-based Features
Yannis Korkontzelos
|
Beverley Thomas
|
Makoto Miwa
|
Sophia Ananiadou
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Classifying research grants into useful categories is a vital task for a funding body, giving structure to the portfolio for analysis and informing strategic planning and decision-making. Automating this classification process would save time and effort, provided that the accuracy of the classifications is maintained. We employ five classification models to classify a set of BBSRC-funded research grants into 21 research topics based on unigrams, technical terms and Latent Dirichlet Allocation models. To boost precision, we investigate methods for combining their predictions into five aggregate classifiers. Evaluation confirmed that ensemble classification models lead to higher precision. It was observed that there is no single best-performing aggregation method for all research topics. Instead, the best-performing method for a research topic depends on the number of positive training instances available for that topic. Subject matter experts considered the predictions of aggregate models to correct erroneous or incomplete manual assignments.
pdf
bib
abs
Identifying Content Types of Messages Related to Open Source Software Projects
Yannis Korkontzelos
|
Paul Thompson
|
Sophia Ananiadou
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
Assessing the suitability of an Open Source Software project for adoption requires not only an analysis of aspects related to the code, such as code quality, frequency of updates and new version releases, but also an evaluation of the quality of support offered in related online forums and issue trackers. Understanding the content types of forum messages and issue trackers can provide information about the extent to which requests are being addressed and issues are being resolved, the percentage of issues that are not being fixed, the cases where the user acknowledged that the issue was successfully resolved, etc. These indicators can provide potential adopters of the OSS with estimates about the level of available support. We present a detailed hierarchy of content types of online forum messages and issue tracker comments and a corpus of messages annotated accordingly. We discuss our experiments to classify forum messages and issue tracker comments into content-related classes, i.e., to assign them to nodes of the hierarchy. The results are very encouraging.
pdf
bib
NaCTeM at SemEval-2016 Task 1: Inferring sentence-level semantic similarity from an ensemble of complementary lexical and sentence-level features
Piotr Przybyła
|
Nhung T. H. Nguyen
|
Matthew Shardlow
|
Georgios Kontonatsios
|
Sophia Ananiadou
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)
pdf
bib
Proceedings of the 15th Workshop on Biomedical Natural Language Processing
Kevin Bretonnel Cohen
|
Dina Demner-Fushman
|
Sophia Ananiadou
|
Jun-ichi Tsujii
Proceedings of the 15th Workshop on Biomedical Natural Language Processing
pdf
bib
abs
Learning to recognise named entities in tweets by exploiting weakly labelled data
Kurt Junshean Espinosa
|
Riza Theresa Batista-Navarro
|
Sophia Ananiadou
Proceedings of the 2nd Workshop on Noisy User-generated Text (WNUT)
Named entity recognition (NER) in social media (e.g., Twitter) is a challenging task due to the noisy nature of text. As part of our participation in the W-NUT 2016 Named Entity Recognition Shared Task, we proposed an unsupervised learning approach using deep neural networks and leveraged a knowledge base (i.e., DBpedia) to bootstrap sparse entity types with weakly labelled data. To further boost the performance, we employed a more sophisticated tagging scheme and applied dropout as a regularisation technique in order to reduce overfitting. Even without hand-crafting linguistic features or leveraging any of the W-NUT-provided gazetteers, we obtained robust performance with our approach, which ranked third amongst all shared task participants according to the official evaluation on a gold standard named entity-annotated corpus of 3,856 tweets.
pdf
bib
Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016)
Sophia Ananiadou
|
Riza Batista-Navarro
|
Kevin Bretonnel Cohen
|
Dina Demner-Fushman
|
Paul Thompson
Proceedings of the Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM2016)
2015
pdf
bib
Proceedings of BioNLP 15
Kevin Bretonnel Cohen
|
Dina Demner-Fushman
|
Sophia Ananiadou
|
Jun-ichi Tsujii
Proceedings of BioNLP 15
pdf
bib
Event Extraction in pieces:Tackling the partial event identification problem on unseen corpora
Chrysoula Zerva
|
Sophia Ananiadou
Proceedings of BioNLP 15
2014
pdf
bib
Comparable Study of Event Extraction in Newswire and Biomedical Domains
Makoto Miwa
|
Paul Thompson
|
Ioannis Korkontzelos
|
Sophia Ananiadou
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
pdf
bib
Combining String and Context Similarity for Bilingual Term Alignment from Comparable Corpora
Georgios Kontonatsios
|
Ioannis Korkontzelos
|
Jun’ichi Tsujii
|
Sophia Ananiadou
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)
pdf
bib
Using a Random Forest Classifier to Compile Bilingual Dictionaries of Technical Terms from Comparable Corpora
Georgios Kontonatsios
|
Ioannis Korkontzelos
|
Jun’ichi Tsujii
|
Sophia Ananiadou
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers
pdf
bib
abs
Interoperability and Customisation of Annotation Schemata in Argo
Rafal Rak
|
Jacob Carter
|
Andrew Rowley
|
Riza Theresa Batista-Navarro
|
Sophia Ananiadou
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
The process of annotating text corpora involves establishing annotation schemata which define the scope and depth of an annotation task at hand. We demonstrate this activity in Argo, a Web-based workbench for the analysis of textual resources, which facilitates both automatic and manual annotation. Annotation tasks in the workbench are defined by building workflows consisting of a selection of available elementary analytics developed in compliance with the Unstructured Information Management Architecture specification. The architecture accommodates complex annotation types that may define primitive as well as referential attributes. Argo aids the development of custom annotation schemata and supports their interoperability by featuring a schema editor and specialised analytics for schemata alignment. The schema editor is a self-contained graphical user interface for defining annotation types. Multiple heterogeneous schemata can be aligned by including one of two type mapping analytics currently offered in Argo. One is based on a simple mapping syntax and, although limited in functionality, covers most common use cases. The other utilises a well established graph query language, SPARQL, and is superior to other state-of-the-art solutions in terms of expressiveness. We argue that the customisation of annotation schemata does not need to compromise their interoperability.
pdf
bib
abs
The Meta-knowledge of Causality in Biomedical Scientific Discourse
Claudiu Mihăilă
|
Sophia Ananiadou
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
Causality lies at the heart of biomedical knowledge, being involved in diagnosis, pathology or systems biology. Thus, automatic causality recognition can greatly reduce the human workload by suggesting possible causal connections and aiding in the curation of pathway models. For this, we rely on corpora that are annotated with classified, structured representations of important facts and findings contained within text. However, it is impossible to correctly interpret these annotations without additional information, e.g., classification of an event as fact, hypothesis, experimental result or analysis of results, confidence of authors about the validity of their analyses etc. In this study, we analyse and automatically detect this type of information, collectively termed meta-knowledge (MK), in the context of existing discourse causality annotations. Our effort proves the feasibility of identifying such pieces of information, without which the understanding of causal relations is limited.
pdf
bib
abs
The Strategic Impact of META-NET on the Regional, National and International Level
Georg Rehm
|
Hans Uszkoreit
|
Sophia Ananiadou
|
Núria Bel
|
Audronė Bielevičienė
|
Lars Borin
|
António Branco
|
Gerhard Budin
|
Nicoletta Calzolari
|
Walter Daelemans
|
Radovan Garabík
|
Marko Grobelnik
|
Carmen García-Mateo
|
Josef van Genabith
|
Jan Hajič
|
Inma Hernáez
|
John Judge
|
Svetla Koeva
|
Simon Krek
|
Cvetana Krstev
|
Krister Lindén
|
Bernardo Magnini
|
Joseph Mariani
|
John McNaught
|
Maite Melero
|
Monica Monachini
|
Asunción Moreno
|
Jan Odijk
|
Maciej Ogrodniczuk
|
Piotr Pęzik
|
Stelios Piperidis
|
Adam Przepiórkowski
|
Eiríkur Rögnvaldsson
|
Michael Rosner
|
Bolette Pedersen
|
Inguna Skadiņa
|
Koenraad De Smedt
|
Marko Tadić
|
Paul Thompson
|
Dan Tufiş
|
Tamás Váradi
|
Andrejs Vasiļjevs
|
Kadri Vider
|
Jolanta Zabarskaite
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
This article provides an overview of the dissemination work carried out in META-NET from 2010 until early 2014; we describe its impact on the regional, national and international level, mainly with regard to politics and the funding situation for LT topics. This paper documents the initiative's work throughout Europe to boost progress and innovation in our field.
pdf
bib
abs
Locating Requests among Open Source Software Communication Messages
Ioannis Korkontzelos
|
Sophia Ananiadou
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
As a first step towards assessing the quality of support offered online for Open Source Software (OSS), we address the task of locating requests, i.e., messages that raise an issue to be addressed by the OSS community, as opposed to any other message. We present a corpus of online communication messages randomly sampled from newsgroups and bug trackers, manually annotated as requests or non-requests. We identify several linguistically shallow, content-based heuristics that correlate with the classification and investigate the extent to which they can serve as independent classification criteria. Then, we train machine-learning classifiers on these heuristics. We experiment with a wide range of settings, such as different learners, excluding some heuristics and adding unigram features of various parts-of-speech and frequency. We conclude that some heuristics can perform well, while their accuracy can be improved further using machine learning, at the cost of obtaining manual annotations.
pdf
bib
Keynote: Supporting evidence-based medicine using text mining
Sophia Ananiadou
Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis (Louhi)
pdf
bib
Building a semantically annotated corpus for congestive heart and renal failure from clinical records and the literature
Noha Alnazzawi
|
Paul Thompson
|
Sophia Ananiadou
Proceedings of the 5th International Workshop on Health Text Mining and Information Analysis (Louhi)
pdf
bib
Proceedings of BioNLP 2014
Kevin Cohen
|
Dina Demner-Fushman
|
Sophia Ananiadou
|
Jun-ichi Tsujii
Proceedings of BioNLP 2014
2013
pdf
bib
What causes a causal relation? Detecting Causal Triggers in Biomedical Scientific Discourse
Claudiu Mihăilă
|
Sophia Ananiadou
51st Annual Meeting of the Association for Computational Linguistics Proceedings of the Student Research Workshop
pdf
bib
Extending an interoperable platform to facilitate the creation of multilingual and multimodal NLP applications
Georgios Kontonatsios
|
Paul Thompson
|
Riza Theresa Batista-Navarro
|
Claudiu Mihăilă
|
Ioannis Korkontzelos
|
Sophia Ananiadou
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations
pdf
bib
Development and Analysis of NLP Pipelines in Argo
Rafal Rak
|
Andrew Rowley
|
Jacob Carter
|
Sophia Ananiadou
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations
pdf
bib
Proceedings of the 2013 Workshop on Biomedical Natural Language Processing
Kevin Bretonnel Cohen
|
Dina Demner-Fushman
|
Sophia Ananiadou
|
John Pestian
|
Jun’ichi Tsujii
Proceedings of the 2013 Workshop on Biomedical Natural Language Processing
pdf
bib
Overview of the Cancer Genetics (CG) task of BioNLP Shared Task 2013
Sampo Pyysalo
|
Tomoko Ohta
|
Sophia Ananiadou
Proceedings of the BioNLP Shared Task 2013 Workshop
pdf
bib
Overview of the Pathway Curation (PC) task of BioNLP Shared Task 2013
Tomoko Ohta
|
Sampo Pyysalo
|
Rafal Rak
|
Andrew Rowley
|
Hong-Woo Chun
|
Sung-Jae Jung
|
Sung-Pil Choi
|
Sophia Ananiadou
|
Jun’ichi Tsujii
Proceedings of the BioNLP Shared Task 2013 Workshop
pdf
bib
NaCTeM EventMine for BioNLP 2013 CG and PC tasks
Makoto Miwa
|
Sophia Ananiadou
Proceedings of the BioNLP Shared Task 2013 Workshop
pdf
bib
Towards a Better Understanding of Discourse: Integrating Multiple Discourse Annotation Perspectives Using UIMA
Claudiu Mihăilă
|
Georgios Kontonatsios
|
Riza Theresa Batista-Navarro
|
Paul Thompson
|
Ioannis Korkontzelos
|
Sophia Ananiadou
Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse
pdf
bib
Making UIMA Truly Interoperable with SPARQL
Rafal Rak
|
Sophia Ananiadou
Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse
pdf
bib
Using a Random Forest Classifier to recognise translations of biomedical terms across languages
Georgios Kontonatsios
|
Ioannis Korkontzelos
|
Sophia Ananiadou
|
Jun’ichi Tsujii
Proceedings of the Sixth Workshop on Building and Using Comparable Corpora
2012
pdf
bib
brat: a Web-based Tool for NLP-Assisted Text Annotation
Pontus Stenetorp
|
Sampo Pyysalo
|
Goran Topić
|
Tomoko Ohta
|
Sophia Ananiadou
|
Jun’ichi Tsujii
Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics
pdf
bib
abs
Biomedical Chinese-English CLIR Using an Extended CMeSH Resource to Expand Queries
Xinkai Wang
|
Paul Thompson
|
Jun’ichi Tsujii
|
Sophia Ananiadou
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Cross-lingual information retrieval (CLIR) involving the Chinese language has been thoroughly studied in the general language domain, but rarely in the biomedical domain, due to the lack of suitable linguistic resources and parsing tools. In this paper, we describe a Chinese-English CLIR system for biomedical literature, which exploits a bilingual ontology, the "eCMeSH Tree". This is an extension of the Chinese Medical Subject Headings (CMeSH) Tree, based on Medical Subject Headings (MeSH). Using the 2006 and 2007 TREC Genomics track data, we have evaluated the performance of the eCMeSH Tree in expanding queries. We have compared our results to those obtained using two other approaches, i.e. pseudo-relevance feedback (PRF) and document translation (DT). Subsequently, we evaluate the performance of different combinations of these three retrieval methods. Our results show that our method of expanding queries using the eCMeSH Tree can outperform the PRF method. Furthermore, combining this method with PRF and DT helps to smooth the differences in query expansion, and consequently results in the best performance amongst all experiments reported. All experiments compare the use of two different retrieval models, i.e. Okapi BM25 and a query likelihood language model. In general, the former performs slightly better.
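The query-expansion step can be pictured with the tiny sketch below, in which each Chinese query term is expanded with English synonyms drawn from a MeSH-like bilingual dictionary. The two dictionary entries are hypothetical stand-ins for the much larger eCMeSH resource.

```python
# Illustrative sketch of dictionary-based query expansion; the entries below
# are hypothetical, not taken from the eCMeSH Tree itself.
SYNONYMS = {
    "心脏病": ["heart disease", "cardiac disease", "heart diseases"],
    "基因": ["gene", "genes"],
}

def expand_query(terms):
    expanded = []
    for term in terms:
        expanded.append(term)                    # keep the original term
        expanded.extend(SYNONYMS.get(term, []))  # add bilingual synonyms
    return expanded

print(expand_query(["心脏病", "基因"]))
```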
pdf
bib
abs
Identification of Manner in Bio-Events
Raheel Nawaz
|
Paul Thompson
|
Sophia Ananiadou
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Due to the rapid growth in the volume of biomedical literature, there is an increasing requirement for high-performance semantic search systems, which allow biologists to perform precise searches for events of interest. Such systems are usually trained on corpora of documents that contain manually annotated events. Until recently, these corpora, and hence the event extraction systems trained on them, focussed almost exclusively on the identification and classification of event arguments, without taking into account how the textual context of the events could affect their interpretation. Previously, we designed an annotation scheme to enrich events with several aspects (or dimensions) of interpretation, which we term meta-knowledge, and applied this scheme to the entire GENIA corpus. In this paper, we report on our experiments to automate the assignment of one of these meta-knowledge dimensions, i.e. Manner, to recognised events. Manner is concerned with the rate, strength, intensity or level of the event. We distinguish three different values of manner, i.e., High, Low and Neutral. To our knowledge, our work represents the first attempt to classify the manner of events. Using a combination of lexical, syntactic and semantic features, our system achieves an overall accuracy of 99.4%.
pdf
bib
abs
Collaborative Development and Evaluation of Text-processing Workflows in a UIMA-supported Web-based Workbench
Rafal Rak
|
Andrew Rowley
|
Sophia Ananiadou
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Challenges in creating comprehensive text-processing workflows include a lack of interoperability between individual components coming from different providers and/or a requirement imposed on end users to know programming techniques in order to compose such workflows. In this paper we demonstrate Argo, a web-based system that addresses these issues in several ways. It supports the widely adopted Unstructured Information Management Architecture (UIMA), which handles the problem of interoperability; it provides a web browser-based interface for developing workflows by drawing diagrams composed of a selection of available processing components; and it provides novel user-interactive analytics such as the annotation editor, which constitutes a bridge between automatic processing and manual correction. These features extend the target audience of Argo to users with a limited or no technical background. Here, we focus specifically on the construction of advanced workflows, involving multiple branching and merging points, to facilitate various comparative evaluations. Together with the user-collaboration capabilities supported in Argo, we demonstrate several use cases including visual inspections, comparisons of multiple processing segments or complete solutions against a reference standard, inter-annotator agreement, and shared-task mass evaluations. Ultimately, Argo emerges as a one-stop workbench for defining, processing, editing and evaluating text processing tasks.
pdf
bib
abs
A data and analysis resource for an experiment in text mining a collection of micro-blogs on a political topic.
William Black
|
Rob Procter
|
Steven Gray
|
Sophia Ananiadou
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
The analysis of a corpus of micro-blogs on the topic of the 2011 UK referendum about the Alternative Vote has been undertaken as a joint activity by text miners and social scientists. To facilitate the collaboration, the corpus and its analysis is managed in a Web-accessible framework that allows users to upload their own textual data for analysis and to manage their own text annotation resources used for analysis. The framework also allows annotations to be searched, and the analysis to be re-run after amending the analysis resources. The corpus is also doubly human-annotated stating both whether each tweet is overall positive or negative in sentiment and whether it is for or against the proposition of the referendum.
pdf
bib
Building Trainable Taggers in a Web-based, UIMA-Supported NLP Workbench
Rafal Rak
|
BalaKrishna Kolluru
|
Sophia Ananiadou
Proceedings of the ACL 2012 System Demonstrations
pdf
bib
BioNLP: Proceedings of the 2012 Workshop on Biomedical Natural Language Processing
Kevin B. Cohen
|
Dina Demner-Fushman
|
Sophia Ananiadou
|
Bonnie Webber
|
Jun’ichi Tsujii
|
John Pestian
BioNLP: Proceedings of the 2012 Workshop on Biomedical Natural Language Processing
pdf
bib
PubMed-Scale Event Extraction for Post-Translational Modifications, Epigenetics and Protein Structural Relations
Jari Björne
|
Sofie Van Landeghem
|
Sampo Pyysalo
|
Tomoko Ohta
|
Filip Ginter
|
Yves Van de Peer
|
Sophia Ananiadou
|
Tapio Salakoski
BioNLP: Proceedings of the 2012 Workshop on Biomedical Natural Language Processing
pdf
bib
New Resources and Perspectives for Biomedical Event Extraction
Sampo Pyysalo
|
Pontus Stenetorp
|
Tomoko Ohta
|
Jin-Dong Kim
|
Sophia Ananiadou
BioNLP: Proceedings of the 2012 Workshop on Biomedical Natural Language Processing
pdf
bib
Bridging the Gap Between Scope-based and Event-based Negation/Speculation Annotations: A Bridge Not Too Far
Pontus Stenetorp
|
Sampo Pyysalo
|
Tomoko Ohta
|
Sophia Ananiadou
|
Jun’ichi Tsujii
Proceedings of the Workshop on Extra-Propositional Aspects of Meaning in Computational Linguistics
pdf
bib
Open-domain Anatomical Entity Mention Detection
Tomoko Ohta
|
Sampo Pyysalo
|
Jun’ichi Tsujii
|
Sophia Ananiadou
Proceedings of the Workshop on Detecting Structure in Scholarly Discourse
pdf
bib
A three-way perspective on scientific discourse annotation for knowledge extraction
Maria Liakata
|
Paul Thompson
|
Anita de Waard
|
Raheel Nawaz
|
Henk Pander Maat
|
Sophia Ananiadou
Proceedings of the Workshop on Detecting Structure in Scholarly Discourse
2011
pdf
bib
Proceedings of BioNLP 2011 Workshop
Kevin Bretonnel Cohen
|
Dina Demner-Fushman
|
Sophia Ananiadou
|
John Pestian
|
Jun’ichi Tsujii
|
Bonnie Webber
Proceedings of BioNLP 2011 Workshop
pdf
bib
Building a Coreference-Annotated Corpus from the Domain of Biochemistry
Riza Theresa Batista-Navarro
|
Sophia Ananiadou
Proceedings of BioNLP 2011 Workshop
pdf
bib
Enrichment and Structuring of Archival Description Metadata
Kalliopi Zervanou
|
Ioannis Korkontzelos
|
Antal van den Bosch
|
Sophia Ananiadou
Proceedings of the 5th ACL-HLT Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities
pdf
bib
Overview of the Infectious Diseases (ID) task of BioNLP Shared Task 2011
Sampo Pyysalo
|
Tomoko Ohta
|
Rafal Rak
|
Dan Sullivan
|
Chunhong Mao
|
Chunxia Wang
|
Bruno Sobral
|
Jun’ichi Tsujii
|
Sophia Ananiadou
Proceedings of BioNLP Shared Task 2011 Workshop
pdf
bib
Promoting Interoperability of Resources in META-SHARE
Paul Thompson
|
Yoshinobu Kano
|
John McNaught
|
Steve Pettifer
|
Teresa Attwood
|
John Keane
|
Sophia Ananiadou
Proceedings of the Workshop on Language Resources, Technology and Services in the Sharing Paradigm
2010
pdf
bib
Imbalanced Classification Using Dictionary-based Prototypes and Hierarchical Decision Rules for Entity Sense Disambiguation
Tingting Mu
|
Xinglong Wang
|
Jun’ichi Tsujii
|
Sophia Ananiadou
Coling 2010: Posters
pdf
bib
abs
Evaluating a Text Mining Based Educational Search Portal
Sophia Ananiadou
|
John McNaught
|
James Thomas
|
Mark Rickinson
|
Sandy Oliver
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
In this paper, we present the main features of a text mining based search engine for the UK Educational Evidence Portal available at the UK National Centre for Text Mining (NaCTeM), together with a user-centred framework for the evaluation of the search engine. The framework is adapted from an existing proposal by the ISLE (EAGLES) Evaluation Working group. We introduce the metrics employed for the evaluation, and explain how these relate to the text mining based search engine. Following this, we describe how we applied the framework to the evaluation of a number of key text mining features of the search engine, namely the automatic clustering of search results, classification of search results according to a taxonomy, and identification of topics and other documents that are related to a chosen document. Finally, we present the results of the evaluation in terms of the strengths, weaknesses and improvements identified for each of these features.
pdf
bib
abs
Meta-Knowledge Annotation of Bio-Events
Raheel Nawaz
|
Paul Thompson
|
John McNaught
|
Sophia Ananiadou
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Biomedical corpora annotated with event-level information provide an important resource for the training of domain-specific information extraction (IE) systems. These corpora concentrate primarily on creating classified, structured representations of important facts and findings contained within the text. However, bio-event annotations often do not take into account additional information (meta-knowledge) that is expressed within the textual context of the bio-event, e.g., the pragmatic/rhetorical intent and the level of certainty ascribed to a particular bio-event by the authors. Such additional information is indispensable for correct interpretation of bio-events. Therefore, an IE system that simply presents a list of bare bio-events, without information concerning their interpretation, is of little practical use. We have addressed this sparseness of meta-knowledge available in existing bio-event corpora by developing a multi-dimensional annotation scheme tailored to bio-events. The scheme is intended to be general enough to allow integration with different types of bio-event annotation, whilst being detailed enough to capture important subtleties in the nature of the meta-knowledge expressed about different bio-events. To our knowledge, our scheme is unique within the field with regard to the diversity of meta-knowledge aspects annotated for each event.
pdf
bib
abs
U-Compare: An Integrated Language Resource Evaluation Platform Including a Comprehensive UIMA Resource Library
Yoshinobu Kano
|
Ruben Dorado
|
Luke McCrohon
|
Sophia Ananiadou
|
Jun’ichi Tsujii
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)
Language resources, including corpora and tools, normally need to be combined in order to achieve a user's specific task. However, resources tend to be developed independently in different, incompatible formats. In this paper we describe U-Compare, which consists of the U-Compare component repository and the U-Compare platform. We have been building a highly interoperable resource library, providing the world's largest ready-to-use UIMA component repository, including a wide variety of corpus readers and state-of-the-art language tools. These resources can be deployed as local services or web services, and can even be hosted on clustered machines to increase performance, while users do not need to be aware of such differences. In addition to the resource library, an integrated language processing platform is provided, allowing workflow creation, comparison, evaluation and visualization, using the resources in the library or any UIMA component, without any programming, via graphical user interfaces; a command line launcher is also available. The evaluation itself is processed in a UIMA component, and users can create and plug in their own evaluation metrics in addition to the predefined ones. U-Compare has been successfully used in many projects including BioCreative, CoNLL and the BioNLP shared task.
pdf
bib
Proceedings of the 2010 Workshop on Biomedical Natural Language Processing
K. Bretonnel Cohen
|
Dina Demner-Fushman
|
Sophia Ananiadou
|
John Pestian
|
Jun’ichi Tsujii
|
Bonnie Webber
Proceedings of the 2010 Workshop on Biomedical Natural Language Processing
pdf
bib
Towards Event Extraction from Full Texts on Infectious Diseases
Sampo Pyysalo
|
Tomoko Ohta
|
Han-Cheol Cho
|
Dan Sullivan
|
Chunhong Mao
|
Bruno Sobral
|
Jun’ichi Tsujii
|
Sophia Ananiadou
Proceedings of the 2010 Workshop on Biomedical Natural Language Processing
pdf
bib
Evaluating a meta-knowledge annotation scheme for bio-events
Raheel Nawaz
|
Paul Thompson
|
Sophia Ananiadou
Proceedings of the Workshop on Negation and Speculation in Natural Language Processing
2009
pdf
bib
abs
ASSIST : un moteur de recherche spécialisé pour l’analyse des cadres d’expériences
Davy Weissenbacher
|
Elisa Pieri
|
Sophia Ananiadou
|
Brian Rea
|
Farida Vis
|
Yuwei Lin
|
Rob Procter
|
Peter Halfpenny
Actes de la 16ème conférence sur le Traitement Automatique des Langues Naturelles. Démonstrations
Qualitative data analysis requires sociologists to carry out substantial work in selecting and interpreting documents. To facilitate this work, the community has adopted software tools, but their functionality remains limited. The ASSIST project is an exploratory study to specify the natural language processing (NLP) modules that can assist sociologists in their analysis work. We present the search engine that was developed and justify the choice of NLP components integrated into the prototype.
pdf
bib
Classifying Relations for Biomedical Named Entity Disambiguation
Xinglong Wang
|
Jun’ichi Tsujii
|
Sophia Ananiadou
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing
pdf
bib
Fast Full Parsing by Linear-Chain Conditional Random Fields
Yoshimasa Tsuruoka
|
Jun’ichi Tsujii
|
Sophia Ananiadou
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)
pdf
bib
Three BioNLP Tools Powered by a Biological Lexicon
Yutaka Sasaki
|
Paul Thompson
|
John McNaught
|
Sophia Ananiadou
Proceedings of the Demonstrations Session at EACL 2009
pdf
bib
Stochastic Gradient Descent Training for L1-regularized Log-linear Models with Cumulative Penalty
Yoshimasa Tsuruoka
|
Jun’ichi Tsujii
|
Sophia Ananiadou
Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP
pdf
bib
Proceedings of the BioNLP 2009 Workshop
K. Bretonnel Cohen
|
Dina Demner-Fushman
|
Sophia Ananiadou
|
John Pestian
|
Jun’ichi Tsujii
|
Bonnie Webber
Proceedings of the BioNLP 2009 Workshop
pdf
bib
Integrated NLP Evaluation System for Pluggable Evaluation Metrics with Extensive Interoperable Toolkit
Yoshinobu Kano
|
Luke McCrohon
|
Sophia Ananiadou
|
Jun’ichi Tsujii
Proceedings of the Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing (SETQA-NLP 2009)
2008
pdf
bib
A Discriminative Alignment Model for Abbreviation Recognition
Naoaki Okazaki
|
Sophia Ananiadou
|
Jun’ichi Tsujii
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)
pdf
bib
Event Frame Extraction Based on a Gene Regulation Corpus
Yutaka Sasaki
|
Paul Thompson
|
Philip Cotter
|
John McNaught
|
Sophia Ananiadou
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)
pdf
bib
A Discriminative Candidate Generator for String Transformations
Naoaki Okazaki
|
Yoshimasa Tsuruoka
|
Sophia Ananiadou
|
Jun’ichi Tsujii
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing
pdf
bib
Identifying Sections in Scientific Abstracts using Conditional Random Fields
Kenji Hirohata
|
Naoaki Okazaki
|
Sophia Ananiadou
|
Mitsuru Ishizuka
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I
pdf
bib
Towards Data and Goal Oriented Analysis: Tool Inter-operability and Combinatorial Comparison
Yoshinobu Kano
|
Ngan Nguyen
|
Rune Sætre
|
Kazuhiro Yoshida
|
Keiichiro Fukamachi
|
Yusuke Miyao
|
Yoshimasa Tsuruoka
|
Sophia Ananiadou
|
Jun’ichi Tsujii
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-II
pdf
bib
abs
Clustering Related Terms with Definitions
Scott Piao
|
John McNaught
|
Sophia Ananiadou
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
It is a challenging task to match similar or related terms/expressions in NLP and Text Mining applications. Two typical areas in need of such work are terminology and ontology construction, where terms and concepts are extracted and organized into structures with various semantic relations. In the EU BOOTSTrep Project we test various techniques for matching terms that can assist human domain experts in building and enriching ontologies. This paper reports on work in which we evaluated a text comparison and clustering tool for this task. In particular, we explore the feasibility of matching related terms with their definitions. Ontology terms, such as Gene Ontology terms, are often assigned detailed definitions, which provide a fundamental information source for detecting relations between terms. Here we focus on the exploitation of term definitions for the term matching task. Our experiment shows that the tool is capable of grouping many related terms using their definitions.
pdf
bib
abs
Building a Bio-Event Annotated Corpus for the Acquisition of Semantic Frames from Biomedical Corpora
Paul Thompson
|
Philip Cotter
|
John McNaught
|
Sophia Ananiadou
|
Simonetta Montemagni
|
Andrea Trabucco
|
Giulia Venturi
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
This paper reports on the design and construction of a bio-event annotated corpus which was developed with a specific view to the acquisition of semantic frames from biomedical corpora. We describe the adopted annotation scheme and the annotation process, which is supported by a dedicated annotation tool. The annotated corpus contains 677 abstracts of biomedical research articles.
pdf
bib
abs
Connecting Text Mining and Pathways using the PathText Resource
Rune Sætre
|
Brian Kemper
|
Kanae Oda
|
Naoaki Okazaki
|
Yukiko Matsuoka
|
Norihiro Kikuchi
|
Hiroaki Kitano
|
Yoshimasa Tsuruoka
|
Sophia Ananiadou
|
Jun’ichi Tsujii
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
Many systems have been developed in the past few years to assist researchers in the discovery of knowledge published as English text, for example in the PubMed database. At the same time, higher level collective knowledge is often published using a graphical notation representing all the entities in a pathway and their interactions. We believe that these pathway visualizations could serve as an effective user interface for knowledge discovery if they can be linked to the text in publications. Since the graphical elements in a Pathway are of a very different nature than their corresponding descriptions in English text, we developed a prototype system called PathText. The goal of PathText is to serve as a bridge between these two different representations. In this paper, we first describe the overall architecture and the interfaces of the PathText system, and then provide some details about the core Text Mining components.
pdf
bib
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
Dina Demner-Fushman
|
Sophia Ananiadou
|
Kevin Bretonnel Cohen
|
John Pestian
|
Jun’ichi Tsujii
|
Bonnie Webber
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
pdf
bib
Accelerating the Annotation of Sparse Named Entities by Dynamic Sentence Selection
Yoshimasa Tsuruoka
|
Jun’ichi Tsujii
|
Sophia Ananiadou
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
pdf
bib
How to Make the Most of NE Dictionaries in Statistical NER
Yutaka Sasaki
|
Yoshimasa Tsuruoka
|
John McNaught
|
Sophia Ananiadou
Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing
2007
pdf
bib
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions
Sophia Ananiadou
Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions
pdf
bib
Text Mining Techniques for Building a Biolexicon
Sophia Ananiadou
Proceedings of the Australasian Language Technology Workshop 2007
pdf
bib
An Annotation Type System for a Data-Driven NLP Pipeline
Udo Hahn
|
Ekaterina Buyko
|
Katrin Tomanek
|
Scott Piao
|
John McNaught
|
Yoshimasa Tsuruoka
|
Sophia Ananiadou
Proceedings of the Linguistic Annotation Workshop
2006
pdf
bib
abs
Clustering acronyms in biomedical text for disambiguation
Naoaki Okazaki
|
Sophia Ananiadou
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
Given the increasing number of neologisms in biomedicine (names of genes, diseases, molecules, etc.), the rate of acronyms used in the literature also increases. Existing acronym dictionaries cannot keep up with the rate of new creations. Thus, discovering and disambiguating acronyms and their expanded forms are essential aspects of text mining and terminology management. We present a method for clustering long forms identified by an acronym recognition method. Applying the acronym recognition method to MEDLINE abstracts, we obtained a list of short/long forms. The recognized short/long forms were classified by a biologist to construct an evaluation set for clustering sets of similar long forms. We observed five types of term variation in the evaluation set (i.e., orthographic, morphological, syntactic and lexico-semantic variants, and nested abbreviations) and defined four similarity measures to gather the similar long forms. The complete-link clustering with the four similarity measures achieved 87.5% precision and 84.9% recall on the evaluation set.
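As a rough illustration of the clustering step, the sketch below performs complete-link agglomerative clustering of long forms using a single orthographic similarity (a character-level ratio) in place of the four combined similarity measures; the threshold and example long forms are arbitrary.

```python
# Illustrative sketch of complete-link clustering of long forms; the single
# difflib similarity stands in for the paper's four combined measures.
from difflib import SequenceMatcher

def sim(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def complete_link(long_forms, threshold=0.5):
    clusters = [[lf] for lf in long_forms]
    while True:
        best, pair = threshold, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # complete link: a cluster pair is only as similar as its
                # least similar pair of members
                link = min(sim(a, b) for a in clusters[i] for b in clusters[j])
                if link >= best:
                    best, pair = link, (i, j)
        if pair is None:
            return clusters
        i, j = pair
        clusters[i] += clusters.pop(j)   # merge the most similar pair

forms = ["nuclear factor kappa B", "nuclear factor-kappaB",
         "natural killer cells", "NK cells"]
print(complete_link(forms))
```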
pdf
bib
abs
Towards a terminological resource for biomedical text mining
Goran Nenadic
|
Naoki Okazaki
|
Sophia Ananiadou
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)
One of the main challenges in biomedical text mining is the identification of terminology, which is a key factor for accessing and integrating the information stored in literature. Manual creation of biomedical terminologies cannot keep pace with the data that becomes available. Still, many of them have been used in attempts to recognise terms in literature, but their suitability for text mining has been questioned as substantial re-engineering is needed to tailor the resources for automatic processing. Several approaches have been suggested to automatically integrate and map between resources, but the problems of extensive variability of lexical representations and ambiguity have been revealed. In this paper we present a methodology to automatically maintain a biomedical terminological database, which contains automatically extracted terms, their mutual relationships, features and possible annotations that can be useful in text processing. In addition to TermDB, a database used for terminology management and storage, we present the following modules that are used to populate the database: TerMine (recognition, extraction and normalisation of terms from literature), AcroTerMine (extraction and clustering of acronyms and their long forms), AnnoTerm (annotation and classification of terms), and ClusTerm (extraction of term associations and clustering of terms).
pdf
bib
A Term Recognition Approach to Acronym Recognition
Naoaki Okazaki
|
Sophia Ananiadou
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions
2005
pdf
bib
A Machine Learning Approach to Acronym Generation
Yoshimasa Tsuruoka
|
Sophia Ananiadou
|
Jun’ichi Tsujii
Proceedings of the ACL-ISMB Workshop on Linking Biological Literature, Ontologies and Databases: Mining Biological Semantics
2004
pdf
bib
Enhancing automatic term recognition through recognition of variation
Goran Nenadic
|
Sophia Ananiadou
|
John McNaught
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics
pdf
bib
Design and Implementation of a Terminology-based Literature Mining and Knowledge Structuring System
Hideki Mima
|
Sophia Ananiadou
|
Katsumori Matsushima
Proceedings of CompuTerm 2004: 3rd International Workshop on Computational Terminology
2003
pdf
bib
An Integrated Term-Based Corpus Query System
Irena Spasic
|
Goran Nenadic
|
Kostas Manios
|
Sophia Ananiadou
10th Conference of the European Chapter of the Association for Computational Linguistics
pdf
bib
Using Domain-Specific Verbs for Term Classification
Irena Spasic
|
Goran Nenadic
|
Sophia Ananiadou
Proceedings of the ACL 2003 Workshop on Natural Language Processing in Biomedicine
pdf
bib
Selecting Text Features for Gene Name Classification: from Documents to Terms
Goran Nenadic
|
Simon Rice
|
Irena Spasic
|
Sophia Ananiadou
|
Benjamin Stapley
Proceedings of the ACL 2003 Workshop on Natural Language Processing in Biomedicine
pdf
bib
Morpho-syntactic Clues for Terminological Processing in Serbian
Goran Nenadić
|
Irena Spasić
|
Sophia Ananiadou
Proceedings of the 2003 EACL Workshop on Morphological Processing of Slavic Languages
2002
pdf
bib
A Methodology for Terminology-based Knowledge Acquisition and Integration
Hideki Mima
|
Sophia Ananiadou
|
Goran Nenadic
|
Jun-Ichi Tsujii
COLING 2002: The 19th International Conference on Computational Linguistics
pdf
bib
Tuning Context Features with Genetic Algorithms
Irena Spasić
|
Goran Nenadić
|
Sophia Ananiadou
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)
pdf
bib
Automatic Acronym Acquisition and Term Variation Management within Domain-Specific Texts
Goran Nenadić
|
Irena Spasić
|
Sophia Ananiadou
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)
pdf
bib
Automatic Discovery of Term Similarities Using Pattern Mining
Goran Nenadić
|
Irena Spasić
|
Sophia Ananiadou
COLING-02: COMPUTERM 2002: Second International Workshop on Computational Terminology
2000
pdf
bib
Identifying Terms by their Family and Friends
Diana Maynard
|
Sophia Ananiadou
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics
pdf
bib
Creating and Using Domain-specific Ontologies for Terminological Applications
Diana Maynard
|
Sophia Ananiadou
Proceedings of the Second International Conference on Language Resources and Evaluation (LREC’00)
1998
pdf
bib
Machine Translation Trends in Europe and Japan
Sophia Ananiadou
Proceedings of Translating and the Computer 20
1996
pdf
bib
Extracting Nested Collocations
Katerina T. Frantzi
|
Sophia Ananiadou
COLING 1996 Volume 1: The 16th International Conference on Computational Linguistics
1994
pdf
bib
Terms are not alone: term choice and choice terms
Sophia Ananiadou
Proceedings of Translating and the Computer 16
pdf
bib
A Methodology for Automatic Term Recognition
Sophia Ananiadou
COLING 1994 Volume 2: The 15th International Conference on Computational Linguistics