Not Everything Is Online Grooming: False Risk Finding in Large Language Model Assessments of Human Conversations

Ellie Prosser, Matthew Edwards


Abstract
Large Language Models (LLMs) have rapidly been adopted by the general public, and as usage of these models becomes commonplace, they will naturally be used for increasingly human-centric tasks, including security advice and risk identification for personal situations. It is imperative that systems used in such a manner are well calibrated. In this paper, six popular LLMs were evaluated for their propensity towards false or over-cautious risk finding in online interactions between real people, with a focus on the risk of online grooming, the advice generated for such contexts, and the impact of prompt specificity. Through an analysis of 3840 generated answers, it was found that models could find online grooming in even the most harmless of interactions, and that the generated advice could be harmful, judgemental, and controlling. We describe these shortcomings and identify areas for improvement, including suggestions for future research directions.
Anthology ID:
2024.nlpaics-1.24
Volume:
Proceedings of the First International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security
Month:
July
Year:
2024
Address:
Lancaster, UK
Editors:
Ruslan Mitkov, Saad Ezzini, Tharindu Ranasinghe, Ignatius Ezeani, Nouran Khallaf, Cengiz Acarturk, Matthew Bradbury, Mo El-Haj, Paul Rayson
Venue:
NLPAICS
Publisher:
International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security
Pages:
220–229
URL:
https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2024.nlpaics-1.24/
Cite (ACL):
Ellie Prosser and Matthew Edwards. 2024. Not Everything Is Online Grooming: False Risk Finding in Large Language Model Assessments of Human Conversations. In Proceedings of the First International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security, pages 220–229, Lancaster, UK. International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security.
Cite (Informal):
Not Everything Is Online Grooming: False Risk Finding in Large Language Model Assessments of Human Conversations (Prosser & Edwards, NLPAICS 2024)
PDF:
https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/2024.nlpaics-1.24.pdf