Esteban Castillo


2020

pdf bib
Active Defense Against Social Engineering: The Case for Human Language Technology
Adam Dalton | Ehsan Aghaei | Ehab Al-Shaer | Archna Bhatia | Esteban Castillo | Zhuo Cheng | Sreekar Dhaduvai | Qi Duan | Bryanna Hebenstreit | Md Mazharul Islam | Younes Karimi | Amir Masoumzadeh | Brodie Mather | Sashank Santhanam | Samira Shaikh | Alan Zemel | Tomek Strzalkowski | Bonnie J. Dorr
Proceedings for the First International Workshop on Social Threats in Online Conversations: Understanding and Management

We describe a system that supports natural language processing (NLP) components for active defenses against social engineering attacks. We deploy a pipeline of human language technology, including Ask and Framing Detection, Named Entity Recognition, Dialogue Engineering, and Stylometry. The system processes modern message formats through a plug-in architecture to accommodate innovative approaches for message analysis, knowledge representation and dialogue generation. The novelty of the system is that it uses NLP for cyber defense and engages the attacker using bots to elicit evidence to attribute to the attacker and to waste the attacker’s time and resources.

pdf
Email Threat Detection Using Distinct Neural Network Approaches
Esteban Castillo | Sreekar Dhaduvai | Peng Liu | Kartik-Singh Thakur | Adam Dalton | Tomek Strzalkowski
Proceedings for the First International Workshop on Social Threats in Online Conversations: Understanding and Management

This paper describes different approaches to detect malicious content in email interactions through a combination of machine learning and natural language processing tools. Specifically, several neural network designs are tested on word embedding representations to detect suspicious messages and separate them from non-suspicious, benign email. The proposed approaches are trained and tested on distinct email collections, including datasets constructed from publicly available corpora (such as Enron, APWG, etc.) as well as several smaller, non-public datasets used in recent government evaluations. Experimental results show that back-propagation both with and without recurrent neural layers outperforms current state of the art techniques that include supervised learning algorithms with stylometric elements of texts as features. Our results also demonstrate that word embedding vectors are effective means for capturing certain aspects of text meaning that can be teased out through machine learning in non-linear/complex neural networks, in order to obtain highly accurate detection of malicious emails based on email text alone.

2016

pdf
UDLAP at SemEval-2016 Task 4: Sentiment Quantification Using a Graph Based Representation
Esteban Castillo | Ofelia Cervantes | Darnes Vilariño | David Báez
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)

2015

pdf
UDLAP: Sentiment Analysis Using a Graph-Based Representation
Esteban Castillo | Ofelia Cervantes | Darnes Vilariño | David Báez | Alfredo Sánchez
Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)

2012

pdf
FCC: Three Approaches for Semantic Textual Similarity
Maya Carrillo | Darnes Vilariño | David Pinto | Mireya Tovar | Saul León | Esteban Castillo
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)

pdf
BUAP: Lexical and Semantic Similarity for Cross-lingual Textual Entailment
Darnes Vilariño | David Pinto | Mireya Tovar | Saul León | Esteban Castillo
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)