SemEval Task 8: A Comparison of Traditional and Neural Models for Detecting Machine Authored Text
Srikar Kashyap Pulipaka, Shrirang Mhalgi, Joseph Larson, Sandra Kübler
Abstract
Since Large Language Models have reached a stage where it is becoming more and more difficult to distinguish between human and machine written text, there is an increasing need for automated systems to distinguish between them. As part of SemEval Task 8, Subtask A: Binary Human-Written vs. Machine-Generated Text Classification, we explore a variety of machine learning classifiers, from traditional statistical methods, such as Naïve Bayes and Decision Trees, to fine-tuned transformer models, such as RoBERTa and ALBERT. Our findings show that using a fine-tuned RoBERTa model with optimized hyperparameters yields the best accuracy. However, the improvement does not translate to the test set because of the differences in distribution in the development and test sets.
- Anthology ID:
- 2024.semeval-1.148
- Volume:
- Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
- Month:
- June
- Year:
- 2024
- Address:
- Mexico City, Mexico
- Editors:
- Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1026–1031
- URL:
- https://aclanthology.org/2024.semeval-1.148
- DOI:
- 10.18653/v1/2024.semeval-1.148
- Cite (ACL):
- Srikar Kashyap Pulipaka, Shrirang Mhalgi, Joseph Larson, and Sandra Kübler. 2024. SemEval Task 8: A Comparison of Traditional and Neural Models for Detecting Machine Authored Text. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1026–1031, Mexico City, Mexico. Association for Computational Linguistics.
- Cite (Informal):
- SemEval Task 8: A Comparison of Traditional and Neural Models for Detecting Machine Authored Text (Pulipaka et al., SemEval 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.semeval-1.148.pdf