Using Discourse Structure to Differentiate Focus Entities from Background Entities in Scientific Literature

Antonio Jimeno Yepes, Ameer Albahem, Karin Verspoor


Abstract
In developing systems to identify focus entities in scientific literature, we face the problem of discriminating key entities of interest from other potentially relevant entities of the same type mentioned in the articles. We introduce the task of pathogen characterisation, in which we aim to identify mentions of biological pathogens that are actively studied in the research presented in scientific publications. These are the pathogens that are the focus of direct experimentation in the research, rather than those that are referred to for context or as playing secondary roles. In this paper, we explore the hypothesis that these focus entities can be differentiated from other, non-actively studied pathogens mentioned in articles by analysing the patterns of mentions across the different sections of a scientific paper, that is, by using the discourse structure of the paper. We provide an indicative case study using a small data set of PubMed abstracts that have been annotated with actively mentioned pathogens.
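The hypothesis in the abstract, that section-level mention patterns separate focus entities from background ones, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the section labels, the function names `mention_profile` and `looks_like_focus`, the example text, and the threshold rule are hypothetical and are not the method or data used in the paper.

```python
# Hypothetical section labels; real structured abstracts may use other headings.
SECTIONS = ("background", "methods", "results", "conclusions")


def mention_profile(sections, entity):
    """Count case-insensitive occurrences of `entity` in each section's text."""
    needle = entity.lower()
    return {name: sections.get(name, "").lower().count(needle) for name in SECTIONS}


def looks_like_focus(profile):
    """Toy heuristic (an assumption, not the paper's classifier):
    treat an entity as a focus candidate if it is mentioned in the
    methods or results sections, not only in the background."""
    return profile["methods"] + profile["results"] > 0


# Invented example text, split into sections by hand.
example = {
    "background": "Influenza A and Zika virus cause significant disease.",
    "methods": "Mice were infected with Zika virus.",
    "results": "Zika virus replicated in neural tissue.",
    "conclusions": "Zika virus targets neural cells.",
}

for pathogen in ("Zika virus", "Influenza A"):
    profile = mention_profile(example, pathogen)
    print(pathogen, profile, looks_like_focus(profile))
```

In this toy example the pathogen mentioned only in the background section is not flagged, while the one that recurs in methods and results is. A real system would need proper entity recognition and a learned model over such per-section features rather than substring counts and a fixed rule.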
Anthology ID:
2021.alta-1.19
Volume:
Proceedings of the 19th Annual Workshop of the Australasian Language Technology Association
Month:
December
Year:
2021
Address:
Online
Venue:
ALTA
Publisher:
Australasian Language Technology Association
Pages:
174–178
URL:
https://aclanthology.org/2021.alta-1.19
Cite (ACL):
Antonio Jimeno Yepes, Ameer Albahem, and Karin Verspoor. 2021. Using Discourse Structure to Differentiate Focus Entities from Background Entities in Scientific Literature. In Proceedings of the 19th Annual Workshop of the Australasian Language Technology Association, pages 174–178, Online. Australasian Language Technology Association.
Cite (Informal):
Using Discourse Structure to Differentiate Focus Entities from Background Entities in Scientific Literature (Jimeno Yepes et al., ALTA 2021)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2021.alta-1.19.pdf