Abstract
We present a system for automating Semantic Role Labelling of Hindi-English code-mixed tweets. We explore the issues posed by noisy, user-generated code-mixed social media data and compare the individual effects of the linguistic features used in our system. Our proposed model is a two-step system for automated labelling that achieves an overall accuracy of 84% for Argument Classification, a 10% increase over the existing rule-based baseline model. To the best of our knowledge, this is the first attempt at building a statistical Semantic Role Labeller for Hindi-English code-mixed data.
- Anthology ID:
- D19-5538
- Volume:
- Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)
- Month:
- November
- Year:
- 2019
- Address:
- Hong Kong, China
- Editors:
- Wei Xu, Alan Ritter, Tim Baldwin, Afshin Rahimi
- Venue:
- WNUT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 291–296
- URL:
- https://aclanthology.org/D19-5538
- DOI:
- 10.18653/v1/D19-5538
- Cite (ACL):
- Riya Pal and Dipti Sharma. 2019. Towards Automated Semantic Role Labelling of Hindi-English Code-Mixed Tweets. In Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019), pages 291–296, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal):
- Towards Automated Semantic Role Labelling of Hindi-English Code-Mixed Tweets (Pal & Sharma, WNUT 2019)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/D19-5538.pdf