UMUTeam at SemEval-2024 Task 8: Combining Transformers and Syntax Features for Machine-Generated Text Detection

Ronghao Pan, José Antonio García-Díaz, Pedro José Vivancos-Vicente, Rafael Valencia-García


Abstract
These working notes describe UMUTeam’s participation in Task 8 of SemEval-2024, entitled “Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection”. The shared task aims to identify machine-generated text in order to mitigate its potential misuse. It is divided into three subtasks: Subtask A, a binary classification task to determine whether a given full text was written by a human or generated by a machine; Subtask B, a multi-class classification task to determine whether a given full text was written by a human or generated by a specific language model; and Subtask C, mixed human-machine text recognition. We participated in Subtask B, using an approach based on fine-tuning a pre-trained model such as RoBERTa, combined with syntactic features of the texts. Our system placed 23rd out of 77 participants with a score of 75.350%, outperforming the baseline.
Anthology ID:
2024.semeval-1.100
Volume:
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
697–702
URL:
https://aclanthology.org/2024.semeval-1.100
Cite (ACL):
Ronghao Pan, José Antonio García-Díaz, Pedro José Vivancos-Vicente, and Rafael Valencia-García. 2024. UMUTeam at SemEval-2024 Task 8: Combining Transformers and Syntax Features for Machine-Generated Text Detection. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 697–702, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
UMUTeam at SemEval-2024 Task 8: Combining Transformers and Syntax Features for Machine-Generated Text Detection (Pan et al., SemEval 2024)
PDF:
https://preview.aclanthology.org/ingestion-checklist/2024.semeval-1.100.pdf
Supplementary material:
 2024.semeval-1.100.SupplementaryMaterial.zip
Supplementary material:
 2024.semeval-1.100.SupplementaryMaterial.txt