Abstract
We explore the effect of injecting background knowledge into different deep neural network (DNN) configurations in order to mitigate the scarcity of annotated data when applying these models to datasets of low-resourced languages. The background knowledge is encoded in the form of lexicons and pre-trained sub-word embeddings. The DNN models are evaluated on the task of detecting code-switching and borrowing points in non-standardised, user-generated Algerian texts. Overall, the results show that DNNs benefit from the added background knowledge, although the gain varies across models and categories. The proposed DNN architectures are generic and could be applied to other low-resourced languages.
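To make the idea of "injecting background knowledge" concrete, here is a minimal, hypothetical PyTorch sketch of one common way such injection is done for token-level tagging: binary lexicon-membership flags are concatenated to each token's (pre-trained) sub-word embedding before a BiLSTM tagger. This is not the paper's architecture; the label set, number of lexicons, and all dimensions below are illustrative assumptions.

```python
# Hedged sketch: lexicon features + pre-trained embeddings feeding a BiLSTM
# token tagger. Label inventory and lexicon count are assumptions, not the
# authors' setup.
import torch
import torch.nn as nn

LABELS = ["ALG", "MSA", "FRA", "BOR", "OTHER"]   # assumed tag inventory
NUM_LEXICONS = 3                                  # e.g. Algerian / MSA / French word lists

class LexiconAugmentedTagger(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hidden=128):
        super().__init__()
        # Embedding table; in practice its weights would be initialised from
        # pre-trained sub-word embeddings (e.g. fastText-style vectors).
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # BiLSTM over the concatenation of embeddings and lexicon features.
        self.lstm = nn.LSTM(emb_dim + NUM_LEXICONS, hidden,
                            batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, len(LABELS))

    def forward(self, token_ids, lexicon_feats):
        # token_ids: (batch, seq_len) integer indices
        # lexicon_feats: (batch, seq_len, NUM_LEXICONS) binary membership flags
        x = torch.cat([self.embed(token_ids), lexicon_feats], dim=-1)
        h, _ = self.lstm(x)
        return self.out(h)   # (batch, seq_len, num_labels) per-token logits

# Toy usage with random inputs
model = LexiconAugmentedTagger(vocab_size=5000)
ids = torch.randint(1, 5000, (2, 7))
feats = torch.randint(0, 2, (2, 7, NUM_LEXICONS)).float()
print(model(ids, feats).shape)   # torch.Size([2, 7, 5])
```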
- Anthology ID: W18-3203
- Volume: Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching
- Month: July
- Year: 2018
- Address: Melbourne, Australia
- Editors: Gustavo Aguilar, Fahad AlGhamdi, Victor Soto, Thamar Solorio, Mona Diab, Julia Hirschberg
- Venue: ACL
- Publisher: Association for Computational Linguistics
- Pages: 20–28
- URL: https://aclanthology.org/W18-3203
- DOI: 10.18653/v1/W18-3203
- Cite (ACL): Wafia Adouane, Jean-Philippe Bernardy, and Simon Dobnik. 2018. Improving Neural Network Performance by Injecting Background Knowledge: Detecting Code-switching and Borrowing in Algerian texts. In Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching, pages 20–28, Melbourne, Australia. Association for Computational Linguistics.
- Cite (Informal): Improving Neural Network Performance by Injecting Background Knowledge: Detecting Code-switching and Borrowing in Algerian texts (Adouane et al., ACL 2018)
- PDF: https://preview.aclanthology.org/ingest-bitext-workshop/W18-3203.pdf