Abstract
We have recently witnessed exponential growth in dialectal Arabic usage in both textual data and speech recordings, especially on social media. Processing such media is of great utility for a range of applications, from information extraction to social media analytics for political and commercial purposes to building decision support systems. Compared to other languages, Arabic, especially the informal variety, poses a significant challenge to natural language processing algorithms: it comprises multiple dialects, exhibits linguistic code switching, and lacks standardized orthographies, on top of its relatively complex morphology. Inherently, the problem of processing Arabic in the context of social media is the problem of how to handle resource-poor languages. In this talk I will go over some of our insights into these problems and show that there is a silver lining: some of our solutions generalize to other low-resource language contexts.
- Anthology ID:
- W16-4805
- Volume:
- Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3)
- Month:
- December
- Year:
- 2016
- Address:
- Osaka, Japan
- Editors:
- Preslav Nakov, Marcos Zampieri, Liling Tan, Nikola Ljubešić, Jörg Tiedemann, Shervin Malmasi
- Venue:
- VarDial
- Publisher:
- The COLING 2016 Organizing Committee
- Pages:
- 42
- URL:
- https://aclanthology.org/W16-4805
- Cite (ACL):
- Mona Diab. 2016. Processing Dialectal Arabic: Exploiting Variability and Similarity to Overcome Challenges and Discover Opportunities. In Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3), page 42, Osaka, Japan. The COLING 2016 Organizing Committee.
- Cite (Informal):
- Processing Dialectal Arabic: Exploiting Variability and Similarity to Overcome Challenges and Discover Opportunities (Diab, VarDial 2016)
- PDF:
- https://preview.aclanthology.org/ingest-acl-2023-videos/W16-4805.pdf