UMUTeam at SemEval-2024 Task 8: Combining Transformers and Syntax Features for Machine-Generated Text Detection
Ronghao Pan, José Antonio García-díaz, Pedro José Vivancos-vicente, Rafael Valencia-garcía
Abstract
These working notes describe UMUTeam’s participation in SemEval-2024 Task 8, “Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection”. The shared task aims to identify machine-generated text in order to mitigate its potential misuse. It is divided into three subtasks: Subtask A, a binary classification task to determine whether a given full text was written by a human or generated by a machine; Subtask B, a multi-class classification task to determine whether a given full text was written by a human or generated by a specific language model; and Subtask C, mixed human-machine text recognition. We participated in Subtask B with an approach based on fine-tuning a pre-trained model, such as RoBERTa, combined with syntactic features of the texts. Our system placed 23rd out of 77 participants with a score of 75.350%, outperforming the baseline.
- Anthology ID:
- 2024.semeval-1.100
- Volume:
- Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 697–702
- URL:
- https://aclanthology.org/2024.semeval-1.100
- Cite (ACL):
- Ronghao Pan, José Antonio García-díaz, Pedro José Vivancos-vicente, and Rafael Valencia-garcía. 2024. UMUTeam at SemEval-2024 Task 8: Combining Transformers and Syntax Features for Machine-Generated Text Detection. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 697–702, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- UMUTeam at SemEval-2024 Task 8: Combining Transformers and Syntax Features for Machine-Generated Text Detection (Pan et al., SemEval 2024)
- PDF:
- https://preview.aclanthology.org/ingestion-checklist/2024.semeval-1.100.pdf