ArbEngVec : Arabic-English Cross-Lingual Word Embedding Model

Raki Lachraf, El Moatez Billah Nagoudi, Youcef Ayachi, Ahmed Abdelali, Didier Schwab



Abstract
Word Embeddings (WE) are increasingly popular and widely applied in many Natural Language Processing (NLP) applications due to their effectiveness in capturing the semantic properties of words; Machine Translation (MT), Information Retrieval (IR), and Information Extraction (IE) are among these areas. In this paper, we propose ArbEngVec, an open-source project that provides several Arabic-English cross-lingual word embedding models. To train our bilingual models, we use a large dataset with more than 93 million pairs of Arabic-English parallel sentences. In addition, we perform both extrinsic and intrinsic evaluations for the different word embedding model variants. The extrinsic evaluation assesses the performance of the models on cross-language Semantic Textual Similarity (STS), while the intrinsic evaluation is based on the Word Translation (WT) task.
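The Word Translation task mentioned in the abstract is commonly evaluated by nearest-neighbour retrieval in the shared embedding space. The sketch below illustrates the general idea with hand-made toy vectors; the vocabulary, transliterated words, and vector values are illustrative assumptions, not taken from the ArbEngVec models or the paper's actual evaluation setup.

```python
import numpy as np

# Hypothetical toy shared Arabic-English embedding space.
# Arabic words are transliterated; all vectors are made up for illustration.
embeddings = {
    "kitab": np.array([0.90, 0.10, 0.00]),   # Arabic "book"
    "book":  np.array([0.88, 0.12, 0.05]),
    "bayt":  np.array([0.12, 0.85, 0.25]),   # Arabic "house"
    "house": np.array([0.10, 0.90, 0.20]),
}

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def translate(word, vocab, k=1):
    """Return the k nearest neighbours of `word` in the shared space,
    which serve as its translation candidates."""
    query = vocab[word]
    candidates = [(w, cosine(query, v)) for w, v in vocab.items() if w != word]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [w for w, _ in candidates[:k]]

print(translate("kitab", embeddings))  # → ['book']
```

With real cross-lingual models, the same retrieval would run over the full bilingual vocabulary, and accuracy@k on a gold translation dictionary would serve as the intrinsic WT score.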
Anthology ID:
W19-4605
Volume:
Proceedings of the Fourth Arabic Natural Language Processing Workshop
Month:
August
Year:
2019
Address:
Florence, Italy
Editors:
Wassim El-Hajj, Lamia Hadrich Belguith, Fethi Bougares, Walid Magdy, Imed Zitouni, Nadi Tomeh, Mahmoud El-Haj, Wajdi Zaghouani
Venue:
WANLP
Publisher:
Association for Computational Linguistics
Pages:
40–48
URL:
https://aclanthology.org/W19-4605
DOI:
10.18653/v1/W19-4605
Cite (ACL):
Raki Lachraf, El Moatez Billah Nagoudi, Youcef Ayachi, Ahmed Abdelali, and Didier Schwab. 2019. ArbEngVec : Arabic-English Cross-Lingual Word Embedding Model. In Proceedings of the Fourth Arabic Natural Language Processing Workshop, pages 40–48, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
ArbEngVec : Arabic-English Cross-Lingual Word Embedding Model (Lachraf et al., WANLP 2019)
PDF:
https://preview.aclanthology.org/teach-a-man-to-fish/W19-4605.pdf