TueCICL at SemEval-2024 Task 8: Resource-efficient approaches for machine-generated text detection

Daniel Stuhlinger, Aron Winkler


Abstract
Recent developments in NLP have brought large language models (LLMs) to the forefront of both public and research attention. As language generation technologies become more widespread, the problem of determining whether a given text is machine-generated arises. SemEval-2024 Task 8 is a shared task with exactly this objective. Our approach aims to develop models and strategies that strike a good balance between performance and model size. We show that smaller systems can compete with large transformer-based solutions.
Anthology ID:
2024.semeval-1.227
Volume:
Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Month:
June
Year:
2024
Address:
Mexico City, Mexico
Editors:
Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
1597–1601
URL:
https://aclanthology.org/2024.semeval-1.227
DOI:
10.18653/v1/2024.semeval-1.227
Cite (ACL):
Daniel Stuhlinger and Aron Winkler. 2024. TueCICL at SemEval-2024 Task 8: Resource-efficient approaches for machine-generated text detection. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 1597–1601, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal):
TueCICL at SemEval-2024 Task 8: Resource-efficient approaches for machine-generated text detection (Stuhlinger & Winkler, SemEval 2024)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2024.semeval-1.227.pdf
Supplementary material:
 2024.semeval-1.227.SupplementaryMaterial.zip
Supplementary material:
 2024.semeval-1.227.SupplementaryMaterial.txt