The University of Texas System Submission for the Code-Switching Workshop Shared Task 2018
Florian Janke | Tongrui Li | Eric Rincón | Gualberto Guzmán | Barbara Bullock | Almeida Jacqueline Toribio
Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching
This paper describes the system submitted to the Named Entity Recognition Shared Task of the Third Workshop on Computational Approaches to Linguistic Code-Switching (CALCS) by the Bilingual Annotations Tasks (BATs) research group of the University of Texas. Our system uses several features to train a Conditional Random Field (CRF) model that classifies input words as Named Entities (NEs) using the Inside-Outside-Beginning (IOB) tagging scheme. We participated in the Modern Standard Arabic-Egyptian Arabic (MSA-EGY) and English-Spanish (ENG-SPA) tasks, achieving weighted average F-scores of 65.62 and 54.16, respectively. We also describe the performance of a deep neural network (NN) trained on a subset of the CRF features, which did not surpass the CRF's performance.
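
To make the IOB tagging scheme concrete, the following minimal sketch converts labeled entity spans into per-token IOB tags and back. It is an illustration only, not the submitted system: the function names `spans_to_iob` and `iob_to_spans`, the example sentence, and the labels `PER` and `LOC` are assumptions chosen for the example and are not taken from the shared task data.

```python
from typing import List, Tuple

# A span is (start index, exclusive end index, entity label).
Span = Tuple[int, int, str]


def spans_to_iob(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping entity spans as one IOB tag per token."""
    tags = ["O"] * n_tokens  # every token starts Outside any entity
    for start, end, label in spans:
        tags[start] = f"B-{label}"  # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"  # remaining tokens of the entity
    return tags


def iob_to_spans(tags: List[str]) -> List[Span]:
    """Decode a sequence of IOB tags back into entity spans."""
    spans: List[Span] = []
    start, label = None, None
    # A trailing "O" sentinel closes any entity that runs to the end.
    for i, tag in enumerate(tags + ["O"]):
        continues = tag.startswith("I-") and tag[2:] == label
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        # A "B-" tag always opens an entity; a stray "I-" tag opens one too.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]
    return spans


# Hypothetical code-switched sentence (illustrative, not from the corpus).
tokens = ["I", "met", "Barbara", "Bullock", "en", "Austin"]
gold_spans = [(2, 4, "PER"), (5, 6, "LOC")]
tags = spans_to_iob(len(tokens), gold_spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

In a CRF setup of the kind the abstract describes, tag sequences like `tags` would serve as the per-token labels that the model learns to predict from word-level features.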