Joining Hands: Exploiting Monolingual Treebanks for Parsing of Code-mixing Data
Irshad Bhat, Riyaz A. Bhat, Manish Shrivastava, Dipti Sharma
Abstract
In this paper, we propose efficient and less resource-intensive strategies for parsing code-mixed data. These strategies are not constrained by in-domain annotations; rather, they leverage pre-existing monolingual annotated resources for training. We show that these methods can produce significantly better results than an informed baseline. Because no evaluation set exists for code-mixed structures, we also present a data set of 450 Hindi and English code-mixed tweets from Hindi multilingual speakers for evaluation.
- Anthology ID: E17-2052
- Volume: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers
- Month: April
- Year: 2017
- Address: Valencia, Spain
- Editors: Mirella Lapata, Phil Blunsom, Alexander Koller
- Venue: EACL
- Publisher: Association for Computational Linguistics
- Pages: 324–330
- URL: https://aclanthology.org/E17-2052
- Cite (ACL): Irshad Bhat, Riyaz A. Bhat, Manish Shrivastava, and Dipti Sharma. 2017. Joining Hands: Exploiting Monolingual Treebanks for Parsing of Code-mixing Data. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, pages 324–330, Valencia, Spain. Association for Computational Linguistics.
- Cite (Informal): Joining Hands: Exploiting Monolingual Treebanks for Parsing of Code-mixing Data (Bhat et al., EACL 2017)
- PDF: https://preview.aclanthology.org/naacl-24-ws-corrections/E17-2052.pdf