Knowledge Distillation with BERT for Image Tag-Based Privacy Prediction

Chenye Zhao, Cornelia Caragea


Abstract
Text in the form of tags associated with online images is often informative for predicting private or sensitive content in images. When privacy prediction systems on social networking sites decide whether each uploaded image should be posted or protected, users may be reluctant to share real images that could reveal their identity, yet willing to share image tags. In such cases, privacy-aware tags become good indicators of image privacy and can be used to generate privacy decisions. In this paper, our aim is to learn tag representations for images to improve tag-based image privacy prediction. To achieve this, we explore self-distillation with BERT, in which we use knowledge in the form of soft probability distributions (soft labels) from the teacher model to help train the student model. Our approach learns better tag representations, improves performance on private image identification, and outperforms state-of-the-art models for this task. Moreover, we apply the idea of knowledge distillation to improve tag representations in a semi-supervised learning task. Our semi-supervised approach, with only 20% of annotated data, achieves performance similar to that of its supervised learning counterpart. Finally, we provide a comprehensive analysis to better understand our approach.
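The soft-label idea mentioned in the abstract can be illustrated with a minimal sketch of a generic distillation objective. This is not the authors' implementation: the function names, the mixing weight `alpha`, and the `temperature` parameter are assumptions chosen for illustration, and the abstract does not state the exact loss used in the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw logits into a probability distribution, softened by temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """Cross-entropy between a target distribution and a predicted one."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, predicted_probs))


def distillation_loss(student_logits, teacher_logits, hard_label,
                      alpha=0.5, temperature=2.0):
    """Hypothetical objective: weighted sum of a soft-label and a hard-label term.

    alpha and temperature are illustrative assumptions, not values from the paper.
    """
    # Soft labels: the teacher's softened probability distribution.
    soft_targets = softmax(teacher_logits, temperature)
    student_soft = softmax(student_logits, temperature)
    soft_loss = cross_entropy(soft_targets, student_soft)

    # Hard-label term: ordinary cross-entropy against the annotated class.
    one_hot = [1.0 if i == hard_label else 0.0 for i in range(len(student_logits))]
    hard_loss = cross_entropy(one_hot, softmax(student_logits))

    return alpha * soft_loss + (1.0 - alpha) * hard_loss


# Two classes, e.g. index 0 = private, index 1 = public (an assumed encoding).
teacher = [2.0, -1.0]
loss_close = distillation_loss([2.0, -1.0], teacher, hard_label=0)
loss_far = distillation_loss([-1.0, 2.0], teacher, hard_label=0)
```

In this sketch, a student whose logits agree with the teacher (`loss_close`) incurs a smaller loss than one that disagrees (`loss_far`). One plausible way to handle unannotated examples in a semi-supervised setting would be to set `alpha=1.0` so that only the teacher's soft labels contribute, though whether the paper does this is not stated here.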
Anthology ID:
2021.ranlp-1.181
Volume:
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
Month:
September
Year:
2021
Address:
Held Online
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
1616–1625
URL:
https://aclanthology.org/2021.ranlp-1.181
Cite (ACL):
Chenye Zhao and Cornelia Caragea. 2021. Knowledge Distillation with BERT for Image Tag-Based Privacy Prediction. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 1616–1625, Held Online. INCOMA Ltd.
Cite (Informal):
Knowledge Distillation with BERT for Image Tag-Based Privacy Prediction (Zhao & Caragea, RANLP 2021)
PDF:
https://preview.aclanthology.org/auto-file-uploads/2021.ranlp-1.181.pdf