Alejandro Beltrán


2020

Supervised Event Coding from Text Written in Arabic: Introducing Hadath
Javier Osorio | Alejandro Reyes | Alejandro Beltrán | Atal Ahmadzai
Proceedings of the Workshop on Automated Extraction of Socio-political Events from News 2020

This article introduces Hadath, a supervised protocol for coding event data from text written in Arabic. Hadath contributes to recent efforts to advance multi-language event coding using computer-based solutions. In this application, we focus on extracting event data about the conflict in Afghanistan from 2008 to 2018 using Arabic-language information sources. The implementation first relies on a machine learning algorithm to classify news stories as relevant to the Afghan conflict. Then, using Hadath, we implement the natural language processing component for coding events from Arabic script. The output database contains daily, geo-referenced information at the district level on who did what to whom, when, and where in the Afghan conflict. The data help to identify trends in the dynamics of violence, the provision of governance, and traditional conflict resolution in Afghanistan for different actors over time and across space.
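The two-stage workflow described in the abstract (filter relevant stories, then code who did what to whom, when and where) can be illustrated with a minimal Python sketch. It is not Hadath's implementation: the dictionaries, the `Event` record, the `is_relevant` and `code_event` helpers, the example sentence, and the date are all hypothetical, and a simple keyword rule stands in for the trained machine learning classifier.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical toy dictionaries mapping Arabic terms to codes.
# These are illustrative assumptions, not Hadath's actual dictionaries.
ACTORS = {"طالبان": "TALIBAN", "الشرطة": "POLICE"}
ACTIONS = {"هاجمت": "ATTACK"}
LOCATIONS = {"كابول": "KABUL"}


@dataclass
class Event:
    date: str
    source: str               # who
    action: str               # did what
    target: str               # to whom
    location: Optional[str]   # where


def is_relevant(text: str) -> bool:
    # Stand-in for the trained classifier: keep stories that
    # mention at least one known location.
    return any(term in text for term in LOCATIONS)


def code_event(text: str, date: str) -> Optional[Event]:
    # Collect actors in order of appearance; in this toy rule the first
    # actor mentioned is the source and the second is the target.
    found = sorted(
        (text.find(term), code) for term, code in ACTORS.items() if term in text
    )
    action = next((code for term, code in ACTIONS.items() if term in text), None)
    if action is None or len(found) < 2:
        return None
    location = next(
        (code for term, code in LOCATIONS.items() if term in text), None
    )
    return Event(date, found[0][1], action, found[1][1], location)


# Invented example sentence: "Taliban attacked the police in Kabul".
story = "هاجمت طالبان الشرطة في كابول"
event = code_event(story, "2015-03-01") if is_relevant(story) else None
print(event)
```

A real system would replace the keyword filter with a trained classifier and resolve locations to districts; the sketch only shows the shape of the output record.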