Stigma Annotation Scheme and Stigmatized Language Detection in Health-Care Discussions on Social Media

Nadiya Straton, Hyeju Jang, Raymond Ng


Abstract
Much research in the social sciences has examined the interpretation of stigma and its influence on human behaviour and health. Stigma results in out-of-group exclusion, distancing, cognitive separation, status loss, discrimination, and in-group pressure, and often leads to disengagement and non-adherence to treatment plans and doctors' prescriptions. However, little work has been conducted on the computational identification of stigma in general, and in social media discourse in particular. In this paper, we develop an annotation scheme and improve the annotation process for stigma identification, both of which can be applied to other health-care domains. Data from pro-vaccination and anti-vaccination discussion groups are annotated by trained annotators with professional backgrounds in social science and health-care studies; this group can therefore be considered expert on the subject in comparison to a non-expert crowd. Amazon MTurk annotators form a second group; because their educational background is unknown, they are initially treated as a non-expert crowd on the subject of stigma. We analyze the annotations with visualisation techniques and features from the LIWC (Linguistic Inquiry and Word Count) list, and make predictions based on bi-grams with traditional and deep learning models. A data augmentation method combined with a CNN achieves high accuracy in comparison to the other models. The high prediction rate achieved with the CNN reconfirms the success of the rigorous annotation process in identifying stigma.
Anthology ID:
2020.lrec-1.148
Volume:
Proceedings of the Twelfth Language Resources and Evaluation Conference
Month:
May
Year:
2020
Address:
Marseille, France
Editors:
Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association
Pages:
1178–1190
Language:
English
URL:
https://aclanthology.org/2020.lrec-1.148
Cite (ACL):
Nadiya Straton, Hyeju Jang, and Raymond Ng. 2020. Stigma Annotation Scheme and Stigmatized Language Detection in Health-Care Discussions on Social Media. In Proceedings of the Twelfth Language Resources and Evaluation Conference, pages 1178–1190, Marseille, France. European Language Resources Association.
Cite (Informal):
Stigma Annotation Scheme and Stigmatized Language Detection in Health-Care Discussions on Social Media (Straton et al., LREC 2020)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/2020.lrec-1.148.pdf