Improving Neural Network Performance by Injecting Background Knowledge: Detecting Code-switching and Borrowing in Algerian texts

Wafia Adouane, Jean-Philippe Bernardy, Simon Dobnik


Abstract
We explore the effect of injecting background knowledge into different deep neural network (DNN) configurations in order to mitigate the problem of the scarcity of annotated data when applying these models to datasets of low-resourced languages. The background knowledge is encoded in the form of lexicons and pre-trained sub-word embeddings. The DNN models are evaluated on the task of detecting code-switching and borrowing points in non-standardised user-generated Algerian texts. Overall results show that DNNs benefit from adding background knowledge. However, the gain varies between models and categories. The proposed DNN architectures are generic and could be applied to other low-resourced languages.
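The abstract describes injecting background knowledge as lexicons and pre-trained sub-word embeddings, but this page does not specify the exact DNN configurations. The following is a minimal illustrative sketch, assuming a BiLSTM token tagger in which each token's pre-trained sub-word embedding is concatenated with binary lexicon-membership flags; all names, dimensions, and the three example lexicons are hypothetical, not taken from the paper.

```python
import torch
import torch.nn as nn

class LexiconAugmentedTagger(nn.Module):
    """Hypothetical sketch: a BiLSTM token classifier whose input
    representation concatenates frozen pre-trained sub-word embeddings
    with binary lexicon-membership features (one flag per lexicon)."""

    def __init__(self, subword_vectors, n_lexicons, n_labels, hidden=128):
        super().__init__()
        # Freeze the pre-trained sub-word embedding table; the lexicon
        # flags are appended as extra input dimensions, not learned.
        self.subword_emb = nn.Embedding.from_pretrained(subword_vectors, freeze=True)
        in_dim = subword_vectors.size(1) + n_lexicons
        self.encoder = nn.LSTM(in_dim, hidden, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, n_labels)

    def forward(self, subword_ids, lexicon_flags):
        # subword_ids:   (batch, seq)             -> indices into the embedding table
        # lexicon_flags: (batch, seq, n_lexicons) -> 1.0 if the token appears in lexicon i
        x = torch.cat([self.subword_emb(subword_ids), lexicon_flags], dim=-1)
        h, _ = self.encoder(x)
        return self.classifier(h)  # per-token logits over the label set

# Toy usage: 100 sub-word types with 50-dim pre-trained vectors; 3 lexicons
# (hypothetically: Algerian Arabic, French, attested borrowings); 4 labels.
vectors = torch.randn(100, 50)
model = LexiconAugmentedTagger(vectors, n_lexicons=3, n_labels=4)
logits = model(torch.randint(0, 100, (2, 7)), torch.randint(0, 2, (2, 7, 3)).float())
print(logits.shape)  # torch.Size([2, 7, 4])
```

Concatenating lexicon flags at the input layer is one common way to inject this kind of background knowledge into a sequence labeller; the paper evaluates several such configurations and reports that the gain varies between models and categories.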
Anthology ID:
W18-3203
Volume:
Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching
Month:
July
Year:
2018
Address:
Melbourne, Australia
Editors:
Gustavo Aguilar, Fahad AlGhamdi, Victor Soto, Thamar Solorio, Mona Diab, Julia Hirschberg
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
20–28
URL:
https://aclanthology.org/W18-3203
DOI:
10.18653/v1/W18-3203
Bibkey:
Cite (ACL):
Wafia Adouane, Jean-Philippe Bernardy, and Simon Dobnik. 2018. Improving Neural Network Performance by Injecting Background Knowledge: Detecting Code-switching and Borrowing in Algerian texts. In Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching, pages 20–28, Melbourne, Australia. Association for Computational Linguistics.
Cite (Informal):
Improving Neural Network Performance by Injecting Background Knowledge: Detecting Code-switching and Borrowing in Algerian texts (Adouane et al., ACL 2018)
PDF:
https://preview.aclanthology.org/ingest-bitext-workshop/W18-3203.pdf