Abstract
In this article we present an extended version of PolEmo – a corpus of consumer reviews from 4 domains: medicine, hotels, products and school. Current version (PolEmo 2.0) contains 8,216 reviews having 57,466 sentences. Each text and sentence was manually annotated with sentiment in 2+1 scheme, which gives a total of 197,046 annotations. We obtained a high value of Positive Specific Agreement, which is 0.91 for texts and 0.88 for sentences. PolEmo 2.0 is publicly available under a Creative Commons copyright license. We explored recent deep learning approaches for the recognition of sentiment, such as Bi-directional Long Short-Term Memory (BiLSTM) and Bidirectional Encoder Representations from Transformers (BERT).- Anthology ID:
- K19-1092
- Volume:
- Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)
- Month:
- November
- Year:
- 2019
- Address:
- Hong Kong, China
- Editors:
- Mohit Bansal, Aline Villavicencio
- Venue:
- CoNLL
- SIG:
- SIGNLL
- Publisher:
- Association for Computational Linguistics
- Note:
- Pages:
- 980–991
- Language:
- URL:
- https://aclanthology.org/K19-1092
- DOI:
- 10.18653/v1/K19-1092
- Cite (ACL):
- Jan Kocoń, Piotr Miłkowski, and Monika Zaśko-Zielińska. 2019. Multi-Level Sentiment Analysis of PolEmo 2.0: Extended Corpus of Multi-Domain Consumer Reviews. In Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL), pages 980–991, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal):
- Multi-Level Sentiment Analysis of PolEmo 2.0: Extended Corpus of Multi-Domain Consumer Reviews (Kocoń et al., CoNLL 2019)
- PDF:
- https://preview.aclanthology.org/ml4al-ingestion/K19-1092.pdf
- Data
- PolEmo 2.0