TueSents at SemEval-2024 Task 8: Predicting the Shift from Human Authorship to Machine-generated Output in a Mixed Text

Valentin Pickard; Hoa Do

doi:10.18653/v1/2024.semeval-1.118


Abstract

This paper describes our approach and results for the SemEval 2024 task of identifying the token index in a mixed text where a switch from human authorship to machine-generated text occurs. We explore two BiLSTMs, one over sentence feature vectors to predict the index of the sentence containing such a change and another over character embeddings of the text. As sentence features, we compute token count, mean token length, standard deviation of token length, counts for punctuation and space characters, various readability scores, word frequency class and word part-of-speech class counts for each sentence. The evaluation is performed on mean absolute error (MAE) between predicted and actual boundary token index. While our competition results were notably below the baseline, there may still be useful aspects to our approach.
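As a rough illustration of the simpler sentence features and the evaluation metric named in the abstract, the following minimal sketch computes token count, mean and standard deviation of token length, and punctuation and space counts for one sentence, along with MAE over boundary indices. It is not the authors' implementation: the function names `sentence_features` and `mean_absolute_error`, the whitespace tokenization, the use of the population standard deviation, and the ASCII punctuation set are all assumptions made for this example. Readability scores, word frequency classes and part-of-speech counts are omitted.

```python
import statistics
import string


def sentence_features(sentence: str) -> dict:
    """Compute a few surface features for one sentence.

    Assumption: tokens are whitespace-separated; the paper's
    tokenizer may differ.
    """
    tokens = sentence.split()
    lengths = [len(t) for t in tokens]
    return {
        "token_count": len(tokens),
        "mean_token_length": statistics.fmean(lengths) if lengths else 0.0,
        # Assumption: population standard deviation (pstdev).
        "std_token_length": statistics.pstdev(lengths) if lengths else 0.0,
        # Assumption: ASCII punctuation only.
        "punctuation_count": sum(ch in string.punctuation for ch in sentence),
        "space_count": sum(ch.isspace() for ch in sentence),
    }


def mean_absolute_error(predicted, actual) -> float:
    """MAE between predicted and actual boundary token indices."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```

In a setup like the one described, one such feature vector per sentence would form the input sequence to the sentence-level BiLSTM, and the MAE function would score the predicted boundary indices against the gold indices.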

Anthology ID: 2024.semeval-1.118
Volume: Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024)
Month: June
Year: 2024
Address: Mexico City, Mexico
Editors: Atul Kr. Ojha, A. Seza Doğruöz, Harish Tayyar Madabushi, Giovanni Da San Martino, Sara Rosenthal, Aiala Rosá
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 829–832
URL: https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.semeval-1.118/
DOI: 10.18653/v1/2024.semeval-1.118
Cite (ACL): Valentin Pickard and Hoa Do. 2024. TueSents at SemEval-2024 Task 8: Predicting the Shift from Human Authorship to Machine-generated Output in a Mixed Text. In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), pages 829–832, Mexico City, Mexico. Association for Computational Linguistics.
Cite (Informal): TueSents at SemEval-2024 Task 8: Predicting the Shift from Human Authorship to Machine-generated Output in a Mixed Text (Pickard & Do, SemEval 2024)
PDF: https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.semeval-1.118.pdf
Supplementary material: 2024.semeval-1.118.SupplementaryMaterial.txt
Supplementary material: 2024.semeval-1.118.SupplementaryMaterial.zip
