Abstract
Sentiment analysis is the computational task of extracting sentiment from a text document – for example whether it expresses a positive, negative or neutral opinion. Various approaches have been introduced in recent years, using a range of different techniques to extract sentiment information from a document. Measuring these methods against a gold standard dataset is a useful way to evaluate such systems. However, different sentiment analysis techniques represent sentiment values in different ways, such as discrete categorical classes or continuous numerical sentiment scores. This creates a challenge for evaluating and comparing such systems; in particular assessing numerical scores against datasets that use fixed classes is difficult, because the numerical outputs have to be mapped onto the ordered classes. This paper proposes a novel calibration technique that uses precision vs. recall curves to set class thresholds to optimize a continuous sentiment analyser’s performance against a discrete gold standard dataset. In experiments mapping a continuous score onto a three-class classification of movie reviews, we show that calibration results in a substantial increase in f-score when compared to a non-calibrated mapping.
- Anthology ID:
- R17-1084
- Volume:
- Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017
- Month:
- September
- Year:
- 2017
- Address:
- Varna, Bulgaria
- Editors:
- Ruslan Mitkov, Galia Angelova
- Venue:
- RANLP
- Publisher:
- INCOMA Ltd.
- Pages:
- 652–660
- URL:
- https://doi.org/10.26615/978-954-452-049-6_084
- DOI:
- 10.26615/978-954-452-049-6_084
- Cite (ACL):
- F. Sharmila Satthar, Roger Evans, and Gulden Uchyigit. 2017. A Calibration Method for Evaluation of Sentiment Analysis. In Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017, pages 652–660, Varna, Bulgaria. INCOMA Ltd.
- Cite (Informal):
- A Calibration Method for Evaluation of Sentiment Analysis (Satthar et al., RANLP 2017)
- PDF:
- https://doi.org/10.26615/978-954-452-049-6_084
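
The mapping problem described in the abstract, turning a continuous sentiment score into three ordered classes, can be illustrated with a small sketch. This is not the authors' procedure: the abstract does not give the details of their precision vs. recall method, so the code below substitutes a plain exhaustive search for the two cut points that maximise macro-averaged F-score. The score range of [-1, 1], the function names, the grid size, and the toy data are all assumptions made for illustration.

```python
from itertools import combinations

LABELS = ("negative", "neutral", "positive")


def classify(score, lo, hi):
    """Map a continuous score onto three ordered classes using two cut points."""
    if score < lo:
        return "negative"
    if score < hi:
        return "neutral"
    return "positive"


def macro_f1(gold, predicted):
    """Unweighted mean of the per-class F1 scores."""
    total = 0.0
    for label in LABELS:
        pairs = list(zip(gold, predicted))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            total += 2 * precision * recall / (precision + recall)
    return total / len(LABELS)


def calibrate(scores, gold, steps=41):
    """Search an even grid over [-1, 1] for cut points lo < hi.

    Assumption: a brute-force stand-in for the paper's curve-based method.
    Returns (best_f1, lo, hi).
    """
    grid = [-1 + 2 * i / (steps - 1) for i in range(steps)]
    best = (-1.0, None, None)
    for lo, hi in combinations(grid, 2):  # pairs come out with lo < hi
        f1 = macro_f1(gold, [classify(s, lo, hi) for s in scores])
        if f1 > best[0]:
            best = (f1, lo, hi)
    return best


# Invented toy data: an analyser whose scores are all shifted upwards.
scores = [0.1, 0.2, 0.3, 0.45, 0.5, 0.55, 0.7, 0.8, 0.9]
gold = ["negative"] * 3 + ["neutral"] * 3 + ["positive"] * 3

# Non-calibrated baseline: split [-1, 1] into three equal-width bands.
uniform_f1 = macro_f1(gold, [classify(s, -1 / 3, 1 / 3) for s in scores])
calibrated_f1, lo, hi = calibrate(scores, gold)
```

On this toy data the equal-width bands never predict the negative class, while the searched cut points separate all three classes. In practice the cut points would be chosen on held-out data rather than on the set used for evaluation.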