Salud María Jiménez-Zafra

Also published as: Salud M. Jiménez Zafra, Salud M. Jiménez-Zafra

2020

Corpora Annotated with Negation: An Overview
Salud María Jiménez-Zafra | Roser Morante | María Teresa Martín-Valdivia | L. Alfonso Ureña-López
Computational Linguistics, Volume 46, Issue 1 - March 2020

Negation is a universal linguistic phenomenon with a great qualitative impact on natural language processing applications. The availability of corpora annotated with negation is essential to training negation processing systems. Currently, most corpora have been annotated for English, but the presence of languages other than English on the Internet, such as Chinese or Spanish, is greater every day. In this study, we present a review of the corpora annotated with negation information in several languages with the goal of evaluating what aspects of negation have been annotated and how compatible the corpora are. We conclude that it is very difficult to merge the existing corpora because we found differences in the annotation schemes used, and most importantly, in the annotation guidelines: the way in which each corpus was tokenized and the negation elements that have been annotated. Unlike other well-established tasks such as semantic role labeling or parsing, negation has neither a standard annotation scheme nor standard guidelines, which hampers progress in its treatment.

Detecting Negation Cues and Scopes in Spanish
Salud María Jiménez-Zafra | Roser Morante | Eduardo Blanco | María Teresa Martín Valdivia | L. Alfonso Ureña López
Proceedings of the 12th Language Resources and Evaluation Conference

In this work we address the processing of negation in Spanish. We first present a machine learning system that processes negation in Spanish. Specifically, we focus on two tasks: i) negation cue detection and ii) scope identification. The corpus used in the experimental framework is the SFU Corpus. The results for cue detection outperform state-of-the-art results, whereas for scope detection this is the first system that performs the task for Spanish. Moreover, we provide a qualitative error analysis aimed at understanding the limitations of the system and showing which negation cues and scopes are straightforward to predict automatically, and which ones are challenging.

2019

SINAI-DL at SemEval-2019 Task 5: Recurrent networks and data augmentation by paraphrasing
Arturo Montejo-Ráez | Salud María Jiménez-Zafra | Miguel A. García-Cumbreras | Manuel Carlos Díaz-Galiano
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper describes the participation of the SINAI-DL team in Task 5 of SemEval 2019, called HatEval. We have applied some classic neural network layers, like word embeddings and LSTM, to build a neural classifier for both proposed tasks. Because the amount of training data provided is small compared to what an adequate learning stage in deep architectures requires, we explore the use of paraphrasing tools as a source for data augmentation. Our results show that this method is promising, as some improvement has been found over non-augmented training sets.

SINAI-DL at SemEval-2019 Task 7: Data Augmentation and Temporal Expressions
Miguel A. García-Cumbreras | Salud María Jiménez-Zafra | Arturo Montejo-Ráez | Manuel Carlos Díaz-Galiano | Estela Saquete
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper describes the participation of the SINAI-DL team at RumourEval (Task 7 in SemEval 2019, subtask A: SDQC). SDQC addresses the challenge of rumour stance classification as an indirect way of identifying potential rumours. Given a tweet with several replies, our system classifies each reply into either supporting, denying, questioning or commenting on the underlying rumours. We have applied data augmentation, temporal expressions labelling and transfer learning with a four-layer neural classifier. We achieve an accuracy of 0.715 with the official run over reply tweets.

2018

A review of Spanish corpora annotated with negation
Salud María Jiménez-Zafra | Roser Morante | Maite Martin | L. Alfonso Ureña-López
Proceedings of the 27th International Conference on Computational Linguistics

The availability of corpora annotated with negation information is essential to develop negation processing systems in any language. However, there is a lack of these corpora even for languages like English, and when there are corpora available they are small and the annotations are not always compatible across corpora. In this paper we review the existing corpora annotated with negation in Spanish with the purpose of, first, gathering the information to make it available to other researchers and, second, analyzing how compatible the corpora are and how the linguistic phenomenon has been addressed. Our final aim is to develop a supervised negation processing system for Spanish, for which we need training and test data. Our analysis shows that it will not be possible to merge the small corpora existing for Spanish due to lack of compatibility in the annotations.

SINAI at SemEval-2018 Task 1: Emotion Recognition in Tweets
Flor Miriam Plaza-del-Arco | Salud María Jiménez-Zafra | Maite Martin | L. Alfonso Ureña-López
Proceedings of The 12th International Workshop on Semantic Evaluation

Emotion classification is a new task that combines several disciplines including Artificial Intelligence and Psychology, although Natural Language Processing is perhaps the most challenging area. In this paper, we describe our participation in SemEval-2018 Task 1: Affect in Tweets. In particular, we have participated in the EI-oc, EI-reg and E-c subtasks for English and Spanish.

2017

SINAI at SemEval-2017 Task 4: User based classification
Salud María Jiménez-Zafra | Arturo Montejo-Ráez | Maite Martin | L. Alfonso Ureña-López
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This document describes our participation in SemEval-2017 Task 4: Sentiment Analysis in Twitter. We have only reported results for subtask B - English, determining the polarity towards a topic on a two-point scale (positive or negative sentiment). Our main contribution is the integration of user information in the classification process. An SVM model is trained with Word2Vec vectors from the user's tweets, extracted from their timeline. The results show that user-specific classifiers trained on tweets from the user's timeline can introduce noise, as they are error-prone because those tweets are classified by an imperfect system. This encourages us to explore further integration of user information for author-based Sentiment Analysis.

2016

Domain Adaptation of Polarity Lexicon combining Term Frequency and Bootstrapping
Salud María Jiménez-Zafra | Maite Martin | M. Dolores Molina-Gonzalez | L. Alfonso Ureña-López
Proceedings of the 7th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

Problematic Cases in the Annotation of Negation in Spanish
Salud María Jiménez-Zafra | Maite Martin | L. Alfonso Ureña-López | Toni Martí | Mariona Taulé
Proceedings of the Workshop on Extra-Propositional Aspects of Meaning in Computational Linguistics (ExProM)

This paper presents the main sources of disagreement found during the annotation of the Spanish SFU Review Corpus with negation (SFU ReviewSP-NEG). Negation detection is a challenge in most NLP tasks, so the availability of corpora annotated with this phenomenon is essential to advancing work in this area. A thorough analysis of the problems found during annotation could help in the study of this phenomenon.

2015

SINAI: Syntactic Approach for Aspect-Based Sentiment Analysis
Salud M. Jiménez-Zafra | Eugenio Martínez-Cámara | M. Teresa Martín-Valdivia | L. Alfonso Ureña-López
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

A Multi-lingual Annotated Dataset for Aspect-Oriented Opinion Mining
Salud M. Jiménez Zafra | Giacomo Berardi | Andrea Esuli | Diego Marcheggiani | María Teresa Martín-Valdivia | Alejandro Moreo Fernández
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

2014

SINAI: Voting System for Aspect Based Sentiment Analysis
Salud María Jiménez-Zafra | Eugenio Martínez-Cámara | Maite Martin | L. Alfonso Ureña-López
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)

SINAI: Voting System for Twitter Sentiment Analysis
Eugenio Martínez-Cámara | Salud María Jiménez-Zafra | Maite Martin | L. Alfonso Ureña-López
Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014)