Alberto Gutiérrez Megías
Also published as: Alberto Gutiérrez-Megías
2024
SINAI at SemEval-2024 Task 8: Fine-tuning on Words and Perplexity as Features for Detecting Machine Written Text
Alberto Gutiérrez Megías
|
L. Alfonso Ureña-López
|
Eugenio Martínez Cámara
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
This work presents the systems proposed by the SINAI team for Subtask A of Task 8 at SemEval-2024. We present the evaluation of two disparate systems and our final submitted system. We claim that the perplexity value of a text may be used as a classification signal. Accordingly, we conduct a study on the utility of perplexity for discerning text authorship, and we perform a comparative analysis of the results obtained on the datasets of the task. This comparative evaluation includes results from the systems evaluated, such as fine-tuning an XLM-RoBERTa-Large transformer or using perplexity as a classification criterion. In addition, we discuss the results reached on the test set, where we show that there are large differences between the language probability distributions of the training and test sets. This analysis allows us to open new research lines to improve the detection of machine-generated text.
Smart Lexical Search for Label Flipping Adversarial Attack
Alberto Gutiérrez-Megías
|
Salud María Jiménez-Zafra
|
L. Alfonso Ureña
|
Eugenio Martínez-Cámara
Proceedings of the Fifth Workshop on Privacy in Natural Language Processing
Language models are vulnerable to adversarial attacks, which manipulate the input data to disrupt their performance. Accordingly, this vulnerability represents a cybersecurity risk. Data manipulations are intended to be unidentifiable by the learning model and by humans, yet small changes can alter the final label of a classification task. Hence, we propose a novel attack built upon explainability methods to identify the salient lexical units to alter in order to flip the classification label. We assess our proposal on a disinformation dataset, and we show that our attack reaches a strong balance between stealthiness and efficiency.