Eduardo Fidalgo Fernandez

Also published as: Eduardo Fidalgo Fernandez

2024

WAVE-27K: Bringing together CTI sources to enhance threat intelligence models
Felipe Castaño | Amaia Gil-Lerchundi | Raul Orduna-Urrutia | Eduardo Fidalgo Fernandez | Rocío Alaiz-Rodríguez
Proceedings of the First International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security

Considering the growing flow of information on the internet and the increase in incident-related data from diverse sources, unstructured text processing is gaining importance. We present an automated approach to link several CTI sources through the mapping of external references. Our method facilitates the automatic construction of datasets, allowing for updates and the inclusion of new samples and labels. Following this method, we built a new dataset of unstructured CTI descriptions called Weakness, Attack, Vulnerabilities, and Events 27k (WAVE-27K). Our dataset includes information about 27 different MITRE techniques, containing 22539 samples related to one technique and 5262 related to two or more techniques simultaneously. We evaluated five BERT-based models on the WAVE-27K dataset, concluding that SecRoBERTa reaches the highest performance with a 77.52% F1 score. Additionally, we compare the performance of SecRoBERTa on the WAVE-27K dataset and on other public datasets. The results show that the model using the WAVE-27K dataset outperforms the others. These results demonstrate that the data within WAVE-27K contains relevant information and that the proposed method effectively built a dataset with a level of quality sufficient to train a machine-learning model.

CECILIA: Enhancing CSIRT Effectiveness with Transformer-Based Cyber Incident Classification
Juan Jose Delgado Sotes | Alicia Martinez Mendoza | Andres Carofilis Vasco | Eduardo Fidalgo Fernandez | Enrique Alegre Gutierrez
Proceedings of the First International Conference on Natural Language Processing and Artificial Intelligence for Cyber Security

This paper introduces an approach to improving incident response times by applying various transformer-based Artificial Intelligence (AI) classification algorithms and analyzing the efficacy of these models in categorizing cyber incidents. As a first contribution, we developed a cyber incident dataset, CECILIA-10C-900, collecting cyber incident reports from six qualified web sources. The contribution of creating a dataset on cyber incident detection is remarkable due to the scarcity of such datasets. Each incident has been tagged by hand according to the cyber incident taxonomy defined by the CERT (Computer Emergency Response Team) of the National Institute of Cybersecurity (INCIBE). This dataset is highly unbalanced, so we decided to unify the four least represented classes under the label “others”, leaving a dataset with six categories (CECILIA-6C-900). With these reliable datasets, we performed a comparison of the best algorithms specifically for the cyber incident classification problem, evaluating eight different metrics on two conventional classifiers and six other transformer-based classifiers. Our study highlights the importance of having a rapid classification mechanism for CSIRTs (Computer Security Incident Response Teams) and showcases the potential of machine learning algorithms to improve cyber defense mechanisms. The findings from our analysis provide valuable insights into the strengths and limitations of different classification techniques. These insights can be used in future work on cyber incident response strategies.

Co-authors

Raul Orduna-Urrutia 1

Juan Jose Delgado Sotes 1

Andres Carofilis Vasco 1

Venues

nlpaics 2
