This is the dataset used in the paper (Chen et al., ACL 2014). It contains preprocessed reviews from 36 domains, with 1,000 reviews per domain. Each sentence is treated as one document. For the text before preprocessing (a larger zip file), please visit http://www.cs.uic.edu/~zchen/.

If you use this dataset, please cite the following paper.

Reference

Zhiyuan Chen, Arjun Mukherjee, and Bing Liu. Aspect Extraction with Automated Prior Knowledge Learning. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014).

