This is the coffee shop computational branding analytics dataset used in the following paper:

William Yang Wang, Edward Lin, John Kominek

"This Text Has the Scent of Starbucks: A Laplacian Structured Sparsity Model for Computational Branding Analytics", 
in Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP 2013),
full paper, Seattle, WA, USA, Oct. 18-21, ACL. 

----------------------------------------------

There are three files:

train.data: the training set.
dev.data: the development set.
test.data: the test set.

Each file has five columns, separated by commas:

first column: 
brand 0-Starbucks, 1-Dunkin, 2-other

second column:
satisfaction 0-negative, 1-positive, 2-neutral

third column:
gender 0-female, 1-male

fourth column:
region 0-West, 1-Midwest, 2-South, 3-Northeast

fifth column:
the text of the review.
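
As an illustration of the column layout above, here is a minimal Python sketch for reading one of the files. 
It is not part of the original release. The names Review, parse_line and load are hypothetical, 
and the sketch assumes that the files have no header row, that the review text is not quoted, 
and that the files are UTF-8 encoded. None of these is confirmed by this README, 
so check a few lines of the data before relying on it.

```python
from typing import List, NamedTuple

# Label codes as listed in the column descriptions of this README.
BRAND = {0: "Starbucks", 1: "Dunkin", 2: "other"}
SATISFACTION = {0: "negative", 1: "positive", 2: "neutral"}
GENDER = {0: "female", 1: "male"}
REGION = {0: "West", 1: "Midwest", 2: "South", 3: "Northeast"}


class Review(NamedTuple):
    brand: str
    satisfaction: str
    gender: str
    region: str
    text: str


def parse_line(line: str) -> Review:
    """Turn one comma-separated line into a Review with readable labels."""
    # Assumption: split on the first four commas only, so that any commas
    # inside the review text stay part of the fifth column.
    brand, satisfaction, gender, region, text = line.rstrip("\n").split(",", 4)
    return Review(
        BRAND[int(brand)],
        SATISFACTION[int(satisfaction)],
        GENDER[int(gender)],
        REGION[int(region)],
        text,
    )


def load(path: str) -> List[Review]:
    """Read a whole file, skipping blank lines."""
    # Assumption: UTF-8 encoding; change this if decoding fails.
    with open(path, encoding="utf-8") as handle:
        return [parse_line(line) for line in handle if line.strip()]
```

For example, parse_line("0,1,1,3,Great latte, friendly staff") gives a Review with brand "Starbucks", 
satisfaction "positive", gender "male", region "Northeast" and the full text after the fourth comma. 
That sample line is invented for illustration and is not taken from the dataset. 
load("train.data") would read the training set in the same way.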

----------------------------------------------
All the documents are collected from the Web using a crawler. 
The original sources retain the copyright of the data.
This dataset is provided without any guarantee.

You are allowed to use this dataset for research purposes only.

For further questions about the dataset, please contact:
William Wang, ww@cmu.edu

v1.0 09/16/2013