An Ensemble Based Approach To Detecting LLM-Generated Texts

Ahmed El-Sayed, Omar Nasr


Abstract
Recent advances in large language models (LLMs) have brought their text generation capabilities on par with those of humans. These advances, paired with the wide availability of the models, have made LLMs adaptable to many domains, from scientific writing to story generation. This rise has made it crucial to develop systems that discriminate between human-authored text and synthetic text generated by LLMs. Our proposed system for the ALTA shared task, based on ensembling a number of language models, claimed first place on the development set with an accuracy of 99.35% and third place on the test set with an accuracy of 98.35%.
Anthology ID:
2023.alta-1.20
Volume:
Proceedings of the 21st Annual Workshop of the Australasian Language Technology Association
Month:
November
Year:
2023
Address:
Melbourne, Australia
Editors:
Smaranda Muresan, Vivian Chen, Casey Kennington, David Vandyke, Nina Dethlefs, Koji Inoue, Erik Ekstedt, Stefan Ultes
Venue:
ALTA
Publisher:
Association for Computational Linguistics
Pages:
164–168
URL:
https://aclanthology.org/2023.alta-1.20
Cite (ACL):
Ahmed El-Sayed and Omar Nasr. 2023. An Ensemble Based Approach To Detecting LLM-Generated Texts. In Proceedings of the 21st Annual Workshop of the Australasian Language Technology Association, pages 164–168, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
An Ensemble Based Approach To Detecting LLM-Generated Texts (El-Sayed & Nasr, ALTA 2023)
PDF:
https://preview.aclanthology.org/ingest-2024-clasp/2023.alta-1.20.pdf