Abstract
In this paper, we propose a “bootstrapping” method for constructing a dependency treebank of Arabic tweets. This method uses a rule-based parser to create a small treebank of one thousand Arabic tweets and a data-driven parser to create a larger treebank by using the small treebank as a seed training set. We are able to create a dependency treebank from unlabelled tweets without any manual intervention. Experimental results show that this method can improve the speed of training the parser and the accuracy of the resulting parsers.
- Anthology ID:
- W17-1312
- Volume:
- Proceedings of the Third Arabic Natural Language Processing Workshop
- Month:
- April
- Year:
- 2017
- Address:
- Valencia, Spain
- Editors:
- Nizar Habash, Mona Diab, Kareem Darwish, Wassim El-Hajj, Hend Al-Khalifa, Houda Bouamor, Nadi Tomeh, Mahmoud El-Haj, Wajdi Zaghouani
- Venue:
- WANLP
- SIG:
- SEMITIC
- Publisher:
- Association for Computational Linguistics
- Pages:
- 94–99
- URL:
- https://aclanthology.org/W17-1312
- DOI:
- 10.18653/v1/W17-1312
- Cite (ACL):
- Fahad Albogamy, Allan Ramsay, and Hanady Ahmed. 2017. Arabic Tweets Treebanking and Parsing: A Bootstrapping Approach. In Proceedings of the Third Arabic Natural Language Processing Workshop, pages 94–99, Valencia, Spain. Association for Computational Linguistics.
- Cite (Informal):
- Arabic Tweets Treebanking and Parsing: A Bootstrapping Approach (Albogamy et al., WANLP 2017)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/W17-1312.pdf
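
The bootstrapping procedure described in the abstract can be sketched roughly as follows. This is only an illustrative outline under stated assumptions, not the authors' implementation: the tree representation, the function names (`bootstrap_treebank`, `toy_rule_parse`, `toy_train`), and the toy parsers are all hypothetical stand-ins for the real rule-based and data-driven parsers, which the abstract does not specify.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Hypothetical tree representation: one (token, head_index) pair per token,
# with 1-based head indices and 0 marking the root.
Tree = List[Tuple[str, int]]
Parser = Callable[[Sequence[str]], Tree]


def bootstrap_treebank(
    seed_tweets: Sequence[Sequence[str]],
    unlabelled_tweets: Sequence[Sequence[str]],
    rule_based_parse: Parser,
    train: Callable[[List[Tree]], Parser],
) -> List[Tree]:
    """Build a larger treebank from a rule-parsed seed, with no manual annotation."""
    # Step 1: the rule-based parser produces the small seed treebank.
    seed_treebank = [rule_based_parse(tweet) for tweet in seed_tweets]
    # Step 2: a data-driven parser is trained on the seed treebank.
    data_driven_parse = train(seed_treebank)
    # Step 3: the trained parser labels the remaining tweets.
    return seed_treebank + [data_driven_parse(t) for t in unlabelled_tweets]


def toy_rule_parse(tokens: Sequence[str]) -> Tree:
    # Toy rule (assumption): first token is the root, all others attach to it.
    return [(tok, 0 if i == 0 else 1) for i, tok in enumerate(tokens)]


def toy_train(treebank: List[Tree]) -> Parser:
    # Toy "training" (assumption): learn the most frequent head index
    # among non-initial tokens and reuse it for every new sentence.
    heads = Counter(head for tree in treebank for _, head in tree[1:])
    best_head = heads.most_common(1)[0][0] if heads else 1

    def parse(tokens: Sequence[str]) -> Tree:
        return [(tok, 0 if i == 0 else best_head) for i, tok in enumerate(tokens)]

    return parse
```

In a realistic setting, `rule_based_parse` and `train` would wrap actual parsers; the point of the sketch is only the data flow from seed treebank to trained parser to larger treebank.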