Abstract
This paper presents the first attempt to investigate the effect of morphological segmentation on En-Ar bilingual word embeddings, using the bilingual word embeddings without word alignment (BilBOWA) model. We also investigate the effect of sentence length and embedding size on the learning process. Our experiment shows that the D3 segmentation scheme improves the accuracy of the learned bilingual word embeddings by up to 10 percentage points compared to the ATB and D0 schemes across all training settings.
- Anthology ID:
- W19-4611
- Volume:
- Proceedings of the Fourth Arabic Natural Language Processing Workshop
- Month:
- August
- Year:
- 2019
- Address:
- Florence, Italy
- Editors:
- Wassim El-Hajj, Lamia Hadrich Belguith, Fethi Bougares, Walid Magdy, Imed Zitouni, Nadi Tomeh, Mahmoud El-Haj, Wajdi Zaghouani
- Venue:
- WANLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 97–107
- URL:
- https://aclanthology.org/W19-4611
- DOI:
- 10.18653/v1/W19-4611
- Cite (ACL):
- Taghreed Alqaisi and Simon O’Keefe. 2019. En-Ar Bilingual Word Embeddings without Word Alignment: Factors Effects. In Proceedings of the Fourth Arabic Natural Language Processing Workshop, pages 97–107, Florence, Italy. Association for Computational Linguistics.
- Cite (Informal):
- En-Ar Bilingual Word Embeddings without Word Alignment: Factors Effects (Alqaisi & O’Keefe, WANLP 2019)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/W19-4611.pdf