Jacinto Mata Vázquez


2021

Identification of profession & occupation in Health-related Social Media using tweets in Spanish
Victoria Pachón | Jacinto Mata Vázquez | Juan Luís Domínguez Olmedo
Proceedings of the Sixth Social Media Mining for Health (#SMM4H) Workshop and Shared Task

In this paper we present our approach and system description for Task 7a of ProfNer-ST: Identification of profession & occupation in Health-related Social Media. Our main contribution is to show the effectiveness of BETO (Spanish BERT), a transformer-based model pretrained on a Spanish corpus, for classification tasks. In our experiments we compared several transformer-based architectures with others based on classical machine learning algorithms. With this approach, we achieved an F1-score of 0.92 in the evaluation process.

2020

I2C at SemEval-2020 Task 12: Simple but Effective Approaches to Offensive Speech Detection in Twitter
Victoria Pachón Álvarez | Jacinto Mata Vázquez | José Manuel López Betanzos | José Luis Arjona Fernández
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This paper describes the systems developed by the I2C Group to participate in Subtasks A and B in English, and Subtask A in Turkish and Arabic, in OffensEval (Task 12 of SemEval 2020). In our experiments we compare three architectures that we developed: two based on Transformers and one based on classical machine learning algorithms. We describe the proposed architectures and present the results obtained by our systems.

2017

Annotating Negation in Spanish Clinical Texts
Noa Cruz | Roser Morante | Manuel J. Maña López | Jacinto Mata Vázquez | Carlos L. Parra Calderón
Proceedings of the Workshop Computational Semantics Beyond Events and Roles

In this paper we present ongoing work on annotating negation in Spanish clinical documents. A corpus of anamnesis and radiology reports has been annotated by two domain expert annotators with negation markers and negated events. The Dice coefficient for inter-annotator agreement is higher than 0.94 for negation markers and higher than 0.72 for negated events. The corpus will be publicly released when the annotation process is finished, constituting the first corpus annotated with negation for Spanish clinical reports available to the NLP community.
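
For readers unfamiliar with the agreement measure mentioned in the abstract above, the following is a minimal sketch of how a Dice coefficient between two annotators could be computed. It assumes each annotation is represented as a (start, end) character span and that only exact span matches count as agreement; the function name, the span representation, and the sample data are illustrative assumptions and are not taken from the paper.

```python
def dice_coefficient(annotations_a, annotations_b):
    """Dice coefficient between two sets of annotations: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(annotations_a), set(annotations_b)
    if not a and not b:
        # Convention chosen here: two empty annotation sets agree perfectly.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical (start, end) spans marked by two annotators on the same report.
annotator_1 = {(0, 2), (10, 12), (20, 23)}
annotator_2 = {(0, 2), (10, 12), (30, 31)}

# Two of the three spans from each annotator match exactly.
agreement = dice_coefficient(annotator_1, annotator_2)
print(f"Dice agreement: {agreement:.2f}")
```

Exact-match comparison is the strictest choice; a scheme that gives credit for partially overlapping spans would yield higher values on the same data.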