2024
Environmental Impact Measurement in the MentalRiskES Evaluation Campaign
Alba M. Mármol Romero | Adrián Moreno-Muñoz | Flor Miriam Plaza-del-Arco | M. Dolores Molina González | Arturo Montejo-Ráez
Proceedings of the Second International Workshop Towards Digital Language Equality (TDLE): Focusing on Sustainability @ LREC-COLING 2024
With the rise of Large Language Models (LLMs), the NLP community is increasingly aware of the environmental consequences of model development, given the energy consumed in training and running these models. This study investigates the energy consumption and environmental impact of systems participating in the MentalRiskES shared task at the Iberian Language Evaluation Forum (IberLEF) 2023, which focuses on early risk identification of mental disorders in Spanish comments. Participants were asked to submit, for each prediction, a set of efficiency metrics, including carbon dioxide emissions. We conduct an empirical analysis of the submitted data, considering model architecture, task complexity, and dataset characteristics, and covering a spectrum from traditional Machine Learning (ML) models to advanced LLMs. Our findings contribute to understanding the ecological footprint of NLP systems and advocate for prioritizing environmental impact assessment in shared tasks to foster sustainability across diverse model types and approaches, with evaluation campaigns being an adequate framework for this kind of analysis.
MentalRiskES: A New Corpus for Early Detection of Mental Disorders in Spanish
Alba M. Mármol Romero | Adrián Moreno Muñoz | Flor Miriam Plaza-del-Arco | M. Dolores Molina González | María Teresa Martín Valdivia | L. Alfonso Ureña-López | Arturo Montejo Ráez
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
With mental health issues on the rise on the Web, especially among young people, there is a growing need for effective identification and intervention. In this paper, we introduce a new open-sourced corpus for the early detection of mental disorders in Spanish, focusing on eating disorders, depression, and anxiety. It consists of user messages posted in different public groups on the Telegram messaging platform and contains over 1,300 subjects with more than 45,000 messages. This corpus has been manually annotated via crowdsourcing and is prepared for use in several Natural Language Processing tasks, including text classification and regression. The samples in the corpus include both text and time data. To provide a benchmark for future research, we conduct experiments on text classification and regression using state-of-the-art transformer-based models.
2020
SINAI at SemEval-2020 Task 12: Offensive Language Identification Exploring Transfer Learning Models
Flor Miriam Plaza del Arco | M. Dolores Molina González | Alfonso Ureña-López | Maite Martin
Proceedings of the Fourteenth Workshop on Semantic Evaluation
This paper describes the participation of the SINAI team in Task 12: OffensEval 2: Multilingual Offensive Language Identification in Social Media. In particular, we participated in Sub-task A in English, which consists of identifying tweets as offensive or not offensive. We preprocess the dataset according to the characteristics of the language used on social media. Then, we select a small subset of the training set provided by the organizers and fine-tune different Transformer-based models in order to test their effectiveness. Our team ranks 20th out of 85 participants in Sub-task A using the XLNet model.
2019
SINAI at SemEval-2019 Task 3: Using affective features for emotion classification in textual conversations
Flor Miriam Plaza-del-Arco | M. Dolores Molina-González | Maite Martin | L. Alfonso Ureña-López
Proceedings of the 13th International Workshop on Semantic Evaluation
Detecting emotions in textual conversation is a challenging problem in the absence of nonverbal cues typically associated with emotion, like facial expression or voice modulations. However, more and more users are using message platforms such as WhatsApp or Telegram. For this reason, it is important to develop systems capable of understanding human emotions in textual conversations. In this paper, we developed different systems to analyze the emotions of textual dialogue from SemEval-2019 Task 3: EmoContext for the English language. Our main contribution is the integration of emotional and sentimental features in the classification using the SVM algorithm.
SINAI at SemEval-2019 Task 5: Ensemble learning to detect hate speech against inmigrants and women in English and Spanish tweets
Flor Miriam Plaza-del-Arco | M. Dolores Molina-González | Maite Martin | L. Alfonso Ureña-López
Proceedings of the 13th International Workshop on Semantic Evaluation
Misogyny and xenophobia are some of the most important social problems. With the increase in the use of social media, this feeling of hatred towards women and immigrants can be more easily expressed, and it can therefore cause harmful effects on social media users. For this reason, it is important to develop systems capable of detecting hateful comments automatically. In this paper, we describe our system to analyze hate speech against immigrants and women in English and Spanish tweets as part of our participation in SemEval-2019 Task 5: hatEval. Our main contribution is the integration of three individual prediction algorithms in a model based on a Vote ensemble classifier.
SINAI at SemEval-2019 Task 6: Incorporating lexicon knowledge into SVM learning to identify and categorize offensive language in social media
Flor Miriam Plaza-del-Arco | M. Dolores Molina-González | Maite Martin | L. Alfonso Ureña-López
Proceedings of the 13th International Workshop on Semantic Evaluation
Offensive language has an impact across society. The use of social media has aggravated this issue among online users, causing suicides in the worst cases. For this reason, it is important to develop systems capable of identifying and detecting offensive language in text automatically. In this paper, we developed a system to classify offensive tweets as part of our participation in SemEval-2019 Task 6: OffensEval. Our main contribution is the integration of lexical features in the classification using the SVM algorithm.
2016
Domain Adaptation of Polarity Lexicon combining Term Frequency and Bootstrapping
Salud María Jiménez-Zafra | Maite Martin | M. Dolores Molina-Gonzalez | L. Alfonso Ureña-López
Proceedings of the 7th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis
2013
Bilingual Experiments on an Opinion Comparable Corpus
Eugenio Martínez-Cámara | M. Teresa Martín-Valdivia | M. Dolores Molina-González | L. Alfonso Ureña-López
Proceedings of the 4th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis