The Influence of the Perplexity Score in the Detection of Machine-generated Texts
Alberto José Gutiérrez Megías, L. Alfonso Ureña-López, Eugenio Martínez Cámara
Abstract
The high performance of large language models (LLMs) in generating natural language represents a real threat, since they can be leveraged to generate any kind of deceptive content. Because disparities remain between machine-generated language and human language, we claim that perplexity may be used as a classification signal to discern between machine and human text. We propose a classification model based on XLM-RoBERTa, and we evaluate it on the M4 dataset. The results show that the perplexity score is useful for the identification of machine-generated text, but it is constrained by the differences among the LLMs used in the training and test sets.
- Anthology ID:
- 2024.nlpaics-1.10
- Volume:
- Proceedings of the First International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security
- Month:
- July
- Year:
- 2024
- Address:
- Lancaster, UK
- Editors:
- Ruslan Mitkov, Saad Ezzini, Tharindu Ranasinghe, Ignatius Ezeani, Nouran Khallaf, Cengiz Acarturk, Matthew Bradbury, Mo El-Haj, Paul Rayson
- Venue:
- NLPAICS
- Publisher:
- International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security
- Pages:
- 80–85
- URL:
- https://preview.aclanthology.org/fix-sig-urls/2024.nlpaics-1.10/
- Cite (ACL):
- Alberto José Gutiérrez Megías, L. Alfonso Ureña-López, and Eugenio Martínez Cámara. 2024. The Influence of the Perplexity Score in the Detection of Machine-generated Texts. In Proceedings of the First International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security, pages 80–85, Lancaster, UK. International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security.
- Cite (Informal):
- The Influence of the Perplexity Score in the Detection of Machine-generated Texts (Gutiérrez Megías et al., NLPAICS 2024)
- PDF:
- https://preview.aclanthology.org/fix-sig-urls/2024.nlpaics-1.10.pdf
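
The abstract treats perplexity as a classification signal. As a rough illustration only, the sketch below computes perplexity from per-token log-probabilities using the standard definition (the exponential of the mean negative log-likelihood). The function name `perplexity` and the probability values are hypothetical; this page does not say which language model the authors use to score text or how the score is combined with XLM-RoBERTa, so none of that is modeled here.

```python
import math
from typing import Sequence


def perplexity(token_log_probs: Sequence[float]) -> float:
    """Return exp of the mean negative natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical per-token probabilities, chosen for illustration only.
# In practice they would come from a language model scoring the text.
predictable_text = [math.log(0.5)] * 4   # each token assigned p = 0.5
surprising_text = [math.log(0.1)] * 4    # each token assigned p = 0.1

print(perplexity(predictable_text))  # lower perplexity: more predictable
print(perplexity(surprising_text))   # higher perplexity: less predictable
```

Under this definition, text that the scoring model finds more predictable gets a lower perplexity. How well that separates machine from human text depends on the scoring model, which is consistent with the abstract's caveat about differences among the LLMs in the training and test sets.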