@inproceedings{li-etal-2016-emotion,
    title = "Emotion Corpus Construction Based on Selection from Hashtags",
    author = "Li, Minglei  and
      Long, Yunfei  and
      Qin, Lu  and
      Li, Wenjie",
    editor = "Calzolari, Nicoletta  and
      Choukri, Khalid  and
      Declerck, Thierry  and
      Goggi, Sara  and
      Grobelnik, Marko  and
      Maegaard, Bente  and
      Mariani, Joseph  and
      Mazo, Helene  and
      Moreno, Asuncion  and
      Odijk, Jan  and
      Piperidis, Stelios",
    booktitle = "Proceedings of the Tenth International Conference on Language Resources and Evaluation ({LREC}'16)",
    month = may,
    year = "2016",
    address = "Portoro{\v{z}}, Slovenia",
    publisher = "European Language Resources Association (ELRA)",
    url = "https://preview.aclanthology.org/add-orcids-2023-acl/L16-1291/",
    pages = "1845--1849",
    abstract = "The availability of labelled corpora is of great importance for supervised learning in emotion classification tasks. Because it is time-consuming to manually label text, hashtags have been used as naturally annotated labels to obtain a large amount of labelled training data from microblogs. However, natural hashtags contain too much noise for them to be used directly in learning algorithms. In this paper, we design a three-stage semi-automatic method to construct an emotion corpus from microblogs. Firstly, a lexicon-based voting approach is used to verify the hashtag automatically. Secondly, an SVM-based classifier is used to select the data whose natural labels are consistent with the predicted labels. Finally, the remaining data are manually examined to filter out the noisy data. Out of about 48K filtered Chinese microblogs, 39K microblogs are selected to form the final corpus, with the Kappa value reaching over 0.92 for the automatic parts and over 0.81 for the manual part. The proportion of automatic selection reaches 54.1{\%}. Thus, the method can reduce about 44.5{\%} of manual workload for acquiring quality data. Experiments with a classifier trained on this corpus show that it achieves results comparable to those obtained with the manually annotated NLP{\&}CC2013 corpus."
}