Hadeel Saadany


2022

pdf
PLOD: An Abbreviation Detection Dataset for Scientific Documents
Leonardo Zilio | Hadeel Saadany | Prashant Sharma | Diptesh Kanojia | Constantin Orăsan
Proceedings of the Thirteenth Language Resources and Evaluation Conference

The detection and extraction of abbreviations from unstructured texts can help to improve the performance of Natural Language Processing tasks, such as machine translation and information retrieval. However, in terms of publicly available datasets, there is not enough data for training deep-neural-networks-based models to the point of generalising well over data. This paper presents PLOD, a large-scale dataset for abbreviation detection and extraction that contains 160k+ segments automatically annotated with abbreviations and their long forms. We performed manual validation over a set of instances and a complete automatic validation for this dataset. We then used it to generate several baseline models for detecting abbreviations and long forms. The best models achieved an F1-score of 0.92 for abbreviations and 0.89 for detecting their corresponding long forms. We release this dataset along with our code and all the models publicly at https://github.com/surrey-nlp/PLOD-AbbreviationDetection

pdf
SURREY-CTS-NLP at WASSA2022: An Experiment of Discourse and Sentiment Analysis for the Prediction of Empathy, Distress and Emotion
Shenbin Qian | Constantin Orasan | Diptesh Kanojia | Hadeel Saadany | Félix Do Carmo
Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis

This paper summarises the submissions our team, SURREY-CTS-NLP has made for the WASSA 2022 Shared Task for the prediction of empathy, distress and emotion. In this work, we tested different learning strategies, like ensemble learning and multi-task learning, as well as several large language models, but our primary focus was on analysing and extracting emotion-intensive features from both the essays in the training data and the news articles, to better predict empathy and distress scores from the perspective of discourse and sentiment analysis. We propose several text feature extraction schemes to compensate the small size of training examples for fine-tuning pretrained language models, including methods based on Rhetorical Structure Theory (RST) parsing, cosine similarity and sentiment score. Our best submissions achieve an average Pearson correlation score of 0.518 for the empathy prediction task and an F1 score of 0.571 for the emotion prediction task, indicating that using these schemes to extract emotion-intensive information can help improve model performance.

2021

pdf
Sentiment-Aware Measure (SAM) for Evaluating Sentiment Transfer by Machine Translation Systems
Hadeel Saadany | Constantin Orăsan | Emad Mohamed | Ashraf Tantavy
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

In translating text where sentiment is the main message, human translators give particular attention to sentiment-carrying words. The reason is that an incorrect translation of such words would miss the fundamental aspect of the source text, i.e. the author’s sentiment. In the online world, MT systems are extensively used to translate User-Generated Content (UGC) such as reviews, tweets, and social media posts, where the main message is often the author’s positive or negative attitude towards the topic of the text. It is important in such scenarios to accurately measure how far an MT system can be a reliable real-life utility in transferring the correct affect message. This paper tackles an under-recognized problem in the field of machine translation evaluation which is judging to what extent automatic metrics concur with the gold standard of human evaluation for a correct translation of sentiment. We evaluate the efficacy of conventional quality metrics in spotting a mistranslation of sentiment, especially when it is the sole error in the MT output. We propose a numerical “sentiment-closeness” measure appropriate for assessing the accuracy of a translated affect message in UGC text by an MT system. We will show that incorporating this sentiment-aware measure can significantly enhance the correlation of some available quality metrics with the human judgement of an accurate translation of sentiment.

pdf
BLEU, METEOR, BERTScore: Evaluation of Metrics Performance in Assessing Critical Translation Errors in Sentiment-Oriented Text
Hadeel Saadany | Constantin Orasan
Proceedings of the Translation and Interpreting Technology Online Conference

Social media companies as well as censorship authorities make extensive use of artificial intelligence (AI) tools to monitor postings of hate speech, celebrations of violence or profanity. Since AI software requires massive volumes of data to train computers, automatic-translation of the online content is usually implemented to compensate for the scarcity of text in some languages. However, machine translation (MT) mistakes are a regular occurrence when translating sentiment-oriented user-generated content (UGC), especially when a low-resource language is involved. In such scenarios, the adequacy of the whole process relies on the assumption that the translation can be evaluated correctly. In this paper, we assess the ability of automatic quality metrics to detect critical machine translation errors which can cause serious misunderstanding of the affect message. We compare the performance of three canonical metrics on meaningless translations as compared to meaningful translations with a critical error that distorts the overall sentiment of the source text. We demonstrate the need for the fine-tuning of automatic metrics to make them more robust in detecting sentiment critical errors.

2020

pdf
Is it Great or Terrible? Preserving Sentiment in Neural Machine Translation of Arabic Reviews
Hadeel Saadany | Constantin Orasan
Proceedings of the Fifth Arabic Natural Language Processing Workshop

Since the advent of Neural Machine Translation (NMT) approaches there has been a tremendous improvement in the quality of automatic translation. However, NMT output still lacks accuracy in some low-resource languages and sometimes makes major errors that need extensive postediting. This is particularly noticeable with texts that do not follow common lexico-grammatical standards, such as user generated content (UGC). In this paper we investigate the challenges involved in translating book reviews from Arabic into English, with particular focus on the errors that lead to incorrect translation of sentiment polarity. Our study points to the special characteristics of Arabic UGC, examines the sentiment transfer errors made by Google Translate of Arabic UGC to English, analyzes why the problem occurs, and proposes an error typology specific of the translation of Arabic UGC. Our analysis shows that the output of online translation tools of Arabic UGC can either fail to transfer the sentiment at all by producing a neutral target text, or completely flips the sentiment polarity of the target word or phrase and hence delivers a wrong affect message. We address this problem by fine-tuning an NMT model with respect to sentiment polarity showing that this approach can significantly help with correcting sentiment errors detected in the online translation of Arabic UGC.

pdf
Fake or Real? A Study of Arabic Satirical Fake News
Hadeel Saadany | Constantin Orasan | Emad Mohamed
Proceedings of the 3rd International Workshop on Rumours and Deception in Social Media (RDSM)

One very common type of fake news is satire which comes in a form of a news website or an online platform that parodies reputable real news agencies to create a sarcastic version of reality. This type of fake news is often disseminated by individuals on their online platforms as it has a much stronger effect in delivering criticism than through a straightforward message. However, when the satirical text is disseminated via social media without mention of its source, it can be mistaken for real news. This study conducts several exploratory analyses to identify the linguistic properties of Arabic fake news with satirical content. It shows that although it parodies real news, Arabic satirical news has distinguishing features on the lexico-grammatical level. We exploit these features to build a number of machine learning models capable of identifying satirical fake news with an accuracy of up to 98.6%. The study introduces a new dataset (3185 articles) scraped from two Arabic satirical news websites (‘Al-Hudood’ and ‘Al-Ahram Al-Mexici’) which consists of fake news. The real news dataset consists of 3710 articles collected from three official news sites: the ‘BBC-Arabic’, the ‘CNN-Arabic’ and ‘Al-Jazeera news’. Both datasets are concerned with political issues related to the Middle East.