Villani at SemEval-2018 Task 8: Semantic Extraction from Cybersecurity Reports using Representation Learning

Pablo Loyola, Kugamoorthy Gajananan, Yuji Watanabe, Fumiko Satoh


Abstract
In this paper, we describe our proposal for the task of Semantic Extraction from Cybersecurity Reports. The goal is to explore whether natural language processing methods can provide relevant and actionable knowledge that contributes to a better understanding of malicious behavior. Our method consists of an attention-based Bi-LSTM, which achieved a competitive performance of 0.57 on Subtask 1. We also present ablation studies across multiple embeddings and their levels of representation, and report the strategies we used to mitigate the extreme imbalance between classes.
Anthology ID:
S18-1143
Volume:
Proceedings of the 12th International Workshop on Semantic Evaluation
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Editors:
Marianna Apidianaki, Saif M. Mohammad, Jonathan May, Ekaterina Shutova, Steven Bethard, Marine Carpuat
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
885–889
URL:
https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/S18-1143/
DOI:
10.18653/v1/S18-1143
Cite (ACL):
Pablo Loyola, Kugamoorthy Gajananan, Yuji Watanabe, and Fumiko Satoh. 2018. Villani at SemEval-2018 Task 8: Semantic Extraction from Cybersecurity Reports using Representation Learning. In Proceedings of the 12th International Workshop on Semantic Evaluation, pages 885–889, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal):
Villani at SemEval-2018 Task 8: Semantic Extraction from Cybersecurity Reports using Representation Learning (Loyola et al., SemEval 2018)
PDF:
https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/S18-1143.pdf