Complex Word Identification: Convolutional Neural Network vs. Feature Engineering

Segun Taofeek Aroyehun, Jason Angel, Daniel Alejandro Pérez Alvarez, Alexander Gelbukh


Abstract
We describe the systems of the NLP-CIC team that participated in the Complex Word Identification (CWI) 2018 shared task. The shared task aimed to benchmark approaches for identifying complex words in English and other languages from the perspective of non-native speakers. Our goal was to compare two approaches: feature engineering and a deep neural network. Both approaches achieved comparable performance on the English test set. We demonstrated the flexibility of the deep-learning approach by using the same deep neural network setup in the Spanish track. Our systems achieved competitive results: all our systems were within 0.01 of the best macro-F1 score on every test set except the Wikipedia test set, on which our best system was 0.04 below the best macro-F1 score.
Anthology ID:
W18-0538
Volume:
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Venue:
BEA
SIG:
SIGEDU
Publisher:
Association for Computational Linguistics
Pages:
322–327
URL:
https://aclanthology.org/W18-0538
DOI:
10.18653/v1/W18-0538
Cite (ACL):
Segun Taofeek Aroyehun, Jason Angel, Daniel Alejandro Pérez Alvarez, and Alexander Gelbukh. 2018. Complex Word Identification: Convolutional Neural Network vs. Feature Engineering. In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 322–327, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal):
Complex Word Identification: Convolutional Neural Network vs. Feature Engineering (Aroyehun et al., BEA 2018)
PDF:
https://preview.aclanthology.org/ingestion-script-update/W18-0538.pdf