A Privacy Preserving Data Publishing Middleware for Unstructured, Textual Social Media Data

Prasadi Abeywardana, Uthayasanker Thayasivam


Abstract
Privacy is going to be an integral part of data science and analytics in the coming years. The next wave of data experimentation will depend heavily on privacy-preserving techniques, mainly because privacy is becoming a legal responsibility rather than a mere social one. Privacy preservation is especially challenging in the context of unstructured data. Social networks have become highly popular over the past couple of decades, and they are creating a huge data lake at high velocity. Social media profiles contain a wealth of personal and sensitive information, creating enormous opportunities for third parties to analyze them with different algorithms, draw conclusions, and use those conclusions in disinformation campaigns and micro-targeting-based dark advertising. This study provides a mitigation mechanism for disinformation campaigns that are based on insights extracted from the analysis of personal or sensitive data. Specifically, this research aims to build a privacy-preserving data publishing middleware for unstructured social media data without compromising the true analytical value of those data. A novel way is proposed to apply traditional privacy-preserving techniques designed for structured data to unstructured data. Creating a comprehensive Twitter corpus annotated with privacy attributes is another objective of this research, especially because the research community lacks one.
Anthology ID:
2020.stoc-1.4
Volume:
Proceedings for the First International Workshop on Social Threats in Online Conversations: Understanding and Management
Month:
May
Year:
2020
Address:
Marseille, France
Venue:
STOC
Publisher:
European Language Resources Association
Pages:
21–28
Language:
English
URL:
https://aclanthology.org/2020.stoc-1.4
Cite (ACL):
Prasadi Abeywardana and Uthayasanker Thayasivam. 2020. A Privacy Preserving Data Publishing Middleware for Unstructured, Textual Social Media Data. In Proceedings for the First International Workshop on Social Threats in Online Conversations: Understanding and Management, pages 21–28, Marseille, France. European Language Resources Association.
Cite (Informal):
A Privacy Preserving Data Publishing Middleware for Unstructured, Textual Social Media Data (Abeywardana & Thayasivam, STOC 2020)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2020.stoc-1.4.pdf