2022
The Financial Narrative Summarisation Shared Task (FNS 2022)
Mahmoud El-Haj | Nadhem Zmandar | Paul Rayson | Ahmed AbuRa’ed | Marina Litvak | Nikiforos Pittaras | George Giannakopoulos | Aris Kosmopoulos | Blanca Carbajo-Coronado | Antonio Moreno-Sandoval
Proceedings of the 4th Financial Narrative Processing Workshop @LREC2022
This paper presents the results and findings of the Financial Narrative Summarisation Shared Task on summarising UK, Greek and Spanish annual reports. The shared task was organised as part of the Financial Narrative Processing 2022 Workshop (FNP 2022 Workshop). The Financial Narrative Summarisation Shared Task (FNS-2022) has been running since 2020 as part of the Financial Narrative Processing (FNP) workshop series (El-Haj et al., 2022; El-Haj et al., 2021; El-Haj et al., 2020b; El-Haj et al., 2019c; El-Haj et al., 2018). The shared task included one main task: the use of either abstractive or extractive automatic summarisers to summarise long documents, namely UK, Greek and Spanish financial annual reports. This shared task is the third to target financial documents. The data for the shared task was created and collected from publicly available annual reports published by firms listed on the stock exchanges of the UK, Greece and Spain. A total of 14 systems from 7 different teams participated in the shared task.
2021
The Financial Narrative Summarisation Shared Task FNS 2021
Nadhem Zmandar | Mahmoud El-Haj | Paul Rayson | Ahmed Abura’Ed | Marina Litvak | George Giannakopoulos | Nikiforos Pittaras
Proceedings of the 3rd Financial Narrative Processing Workshop
Cartography of Natural Language Processing for Social Good (NLP4SG): Searching for Definitions, Statistics and White Spots
Paula Fortuna | Laura Pérez-Mayos | Ahmed AbuRa’ed | Juan Soler-Company | Leo Wanner
Proceedings of the 1st Workshop on NLP for Positive Impact
The range of works that can be considered as developing NLP for social good (NLP4SG) is enormous. While many of them target the identification of hate speech or fake news, there are others that address, e.g., text simplification to alleviate consequences of dyslexia, or coaching strategies to fight depression. However, so far, there is no clear picture of what areas are targeted by NLP4SG, who are the actors, which are the main scenarios and what are the topics that have been left aside. In order to obtain a clearer view in this respect, we first propose a working definition of NLP4SG and identify some primary aspects that are crucial for NLP4SG, including, e.g., areas, ethics, privacy and bias. Then, we draw upon a corpus of around 50,000 articles downloaded from the ACL Anthology. Based on a list of keywords retrieved from the literature and revised in view of the task, we select from this corpus articles that can be considered to be on NLP4SG according to our definition and analyze them in terms of trends along the time line, etc. The result is a map of the current NLP4SG research and insights concerning the white spots on this map.
2020
A Multi-level Annotated Corpus of Scientific Papers for Scientific Document Summarization and Cross-document Relation Discovery
Ahmed AbuRa’ed | Horacio Saggion | Luis Chiruzzo
Proceedings of the Twelfth Language Resources and Evaluation Conference
Related work sections or literature reviews are an essential part of every scientific article, being crucial for paper reviewing and assessment. The automatic generation of related work sections can be considered an instance of the multi-document summarization problem. In order to allow the study of this specific problem, we have developed a manually annotated, machine-readable dataset of related work sections, cited papers (e.g. references) and sentences, together with an additional layer of papers citing the references. We additionally present experiments on the identification of cited sentences, using citation contexts as input. The corpus and the gold standard are made available for use by the scientific community.
The Financial Narrative Summarisation Shared Task (FNS 2020)
Mahmoud El-Haj | Ahmed AbuRa’ed | Marina Litvak | Nikiforos Pittaras | George Giannakopoulos
Proceedings of the 1st Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation
This paper presents the results and findings of the Financial Narrative Summarisation shared task (FNS 2020) on summarising UK annual reports. The shared task was organised as part of the 1st Financial Narrative Processing and Financial Narrative Summarisation Workshop (FNP-FNS 2020). The shared task included one main task: the use of either abstractive or extractive summarisation methodologies and techniques to automatically summarise UK financial annual reports. The FNS shared task is the first to target financial annual reports. The data for the shared task was created and collected from publicly available UK annual reports published by firms listed on the London Stock Exchange (LSE). A total of 24 systems from 9 different teams participated in the shared task. In addition, we had 2 baseline summarisers and 2 additional topline summarisers to help evaluate and compare against the results of the participants.
2019
OlloBot - Towards A Text-Based Arabic Health Conversational Agent: Evaluation and Results
Ahmed Fadhil | Ahmed AbuRa’ed
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
We introduce OlloBot, an Arabic conversational agent that assists physicians and supports patients with the care process. It does not replace physicians; instead, it provides health tracking and support and assists physicians with care delivery through a conversational medium. The current model comprises healthy diet, physical activity and mental health, in addition to food logging. Not only does OlloBot track users' daily food intake, it also offers useful tips for healthier living. We discuss the design, development and testing of OlloBot, and highlight the findings and limitations that arose from the testing.
2018
LaSTUS/TALN at Complex Word Identification (CWI) 2018 Shared Task
Ahmed AbuRa’ed | Horacio Saggion
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
This paper presents the participation of the LaSTUS/TALN team in the Complex Word Identification (CWI) Shared Task 2018 in the English monolingual track. The purpose of the task was to determine whether a word in a given sentence can be judged as complex or not by a certain target audience. For the English track, the task organizers provided training and development datasets of 27,299 and 3,328 words respectively, together with the sentence in which each word occurs. The words were judged as complex or not by 20 human evaluators, ten of whom are native speakers. We submitted two systems: one system models each word to be evaluated as a numeric vector populated with a set of lexical, semantic and contextual features, while the other relies on a word embedding representation and a distance metric. We trained two separate classifiers to automatically decide whether each word is complex or not. We submitted six runs, two for each of the three subsets of the English monolingual CWI track.
2017
What Sentence are you Referring to and Why? Identifying Cited Sentences in Scientific Literature
Ahmed AbuRa’ed | Luis Chiruzzo | Horacio Saggion
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017
In the current context of scientific information overload, text mining tools are of paramount importance for researchers who have to read scientific papers and assess their value. Current citation networks, which link papers by citation relationships (reference and citing paper), are useful for quantitatively understanding the value of a piece of scientific work; however, they are limited in that they do not provide information about what specific part of the reference paper the citing paper is referring to. This qualitative information is very important, for example, in the context of current community-based scientific summarization activities. In this paper, relying on an annotated dataset of co-citation sentences, we carry out a number of experiments aimed at, given a citation sentence, automatically identifying the part of a reference paper being cited. Additionally, our algorithm predicts the specific reason why such a reference sentence has been cited, out of five possible reasons.
2016
Trainable Citation-enhanced Summarization of Scientific Articles
Horacio Saggion | Ahmed AbuRa’ed | Francesco Ronzano
Proceedings of the Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL)
TALN at SemEval-2016 Task 11: Modelling Complex Words by Contextual, Lexical and Semantic Features
Francesco Ronzano | Ahmed Abura’ed | Luis Espinosa-Anke | Horacio Saggion
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)