Auditing Keyword Queries Over Text Documents
Bharath Kumar Reddy Apparreddy, Sailaja Rajanala, Manish Singh
Abstract
Data security and privacy are issues of growing importance in the healthcare domain. In this paper, we present an auditing system to detect privacy violations for unstructured text documents such as healthcare records. Given a sensitive document, our anomaly detection algorithm finds the top-k suspicious keyword queries that may have accessed it. Since unstructured healthcare data, such as medical reports and query logs, are not easily available for public research, we show how one can use the publicly available DBLP data to create an equivalent healthcare dataset and query log, which can then be used for experimental evaluation.
- Anthology ID:
- 2021.icon-main.46
- Volume:
- Proceedings of the 18th International Conference on Natural Language Processing (ICON)
- Month:
- December
- Year:
- 2021
- Address:
- National Institute of Technology Silchar, Silchar, India
- Editors:
- Sivaji Bandyopadhyay, Sobha Lalitha Devi, Pushpak Bhattacharyya
- Venue:
- ICON
- Publisher:
- NLP Association of India (NLPAI)
- Pages:
- 378–387
- URL:
- https://aclanthology.org/2021.icon-main.46
- Cite (ACL):
- Bharath Kumar Reddy Apparreddy, Sailaja Rajanala, and Manish Singh. 2021. Auditing Keyword Queries Over Text Documents. In Proceedings of the 18th International Conference on Natural Language Processing (ICON), pages 378–387, National Institute of Technology Silchar, Silchar, India. NLP Association of India (NLPAI).
- Cite (Informal):
- Auditing Keyword Queries Over Text Documents (Apparreddy et al., ICON 2021)
- PDF:
- https://preview.aclanthology.org/emnlp22-frontmatter/2021.icon-main.46.pdf