Luis Borges Gouveia

2024

pdf bib abs
A Proposal Framework Security Assessment for Large Language Models
Daniel Mendonça Colares | Raimir Holanda Filho | Luis Borges Gouveia
Proceedings of the First International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security

Large Language Models (LLMs), despite their numerous applications and the significant benefits they offer, have proven to be extremely susceptible to attacks of various natures. Due to their large number of vulnerabilities, often unknown, and which consequently become potential targets for attacks, investing in the implementation of this technology becomes a gamble. Ensuring the security of LLMs is of utmost importance, but unfortunately, providing effective security for so many different vulnerabilities is a costly task, especially for companies seeking rapid growth. Many studies focus on analyzing the security of LLMs for specific types of vulnerabilities, such as prompt inject or jailbreaking, but they rarely assess the security of the model as a whole. Therefore, this study aims to facilitate the evaluation of vulnerabilities across various models and identify their main weaknesses. To achieve this, our work sought to develop a comprehensive framework capable of utilizing various scanners to assess the security of LLMs, allowing for a detailed analysis of their vulnerabilities. Through the use of the framework, we tested and evaluated multiple models, and with the results collected from these assessments of various vulnerabilities for each model tested, we analyzed the obtained data. Our results not only demonstrated potential weaknesses in certain models but also revealed a possible relationship between model security and the number of parameters for similar models.

Co-authors

Venues

nlpaics1

Fix data