Tongrui Li


2018

The University of Texas System Submission for the Code-Switching Workshop Shared Task 2018
Florian Janke | Tongrui Li | Eric Rincón | Gualberto Guzmán | Barbara Bullock | Almeida Jacqueline Toribio
Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching

This paper describes the system for the Named Entity Recognition Shared Task of the Third Workshop on Computational Approaches to Linguistic Code-Switching (CALCS) submitted by the Bilingual Annotations Tasks (BATs) research group of the University of Texas. Our system uses several features to train a Conditional Random Field (CRF) model for classifying input words as Named Entities (NEs) using the Inside-Outside-Beginning (IOB) tagging scheme. We participated in the Modern Standard Arabic-Egyptian Arabic (MSA-EGY) and English-Spanish (ENG-SPA) tasks, achieving weighted average F-scores of 65.62 and 54.16, respectively. We also describe the performance of a deep neural network (NN) trained on a subset of the CRF features, which did not surpass CRF performance.
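
The IOB tagging scheme mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `spans_to_iob`, the example sentence, and the labels `LOC` and `PER` are hypothetical placeholders, and the sketch assumes the common IOB2 variant, in which every entity starts with a `B-` tag.

```python
from typing import List, Tuple


def spans_to_iob(tokens: List[str], spans: List[Tuple[int, int, str]]) -> List[str]:
    """Convert entity spans to per-token IOB2 tags.

    Each span is (start, end_exclusive, label) in token indices.
    Hypothetical helper for illustration; assumes spans do not overlap.
    """
    tags = ["O"] * len(tokens)  # every token starts Outside any entity
    for start, end, label in spans:
        tags[start] = f"B-{label}"  # first token of the entity: Beginning
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"  # remaining tokens: Inside
    return tags


# Invented code-switched example, not taken from the shared-task data.
tokens = ["Visité", "New", "York", "with", "Maria"]
spans = [(1, 3, "LOC"), (4, 5, "PER")]
tags = spans_to_iob(tokens, spans)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

A sequence model such as a CRF would then be trained to predict one such tag per token from token-level features; the specific features used by the system are not listed in the abstract.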