Aniruddha Bala
2021
An On-device Deep-Learning Approach for Attribute Extraction from Heterogeneous Unstructured Text
Mahesh Gorijala | Aniruddha Bala | Pinaki Bhaskar | Krishnaditya | Vikram Mupparthi
Proceedings of the 18th International Conference on Natural Language Processing (ICON)
Mobile devices, with their rapidly growing usage, have become rich sources of user information that hold critical insights for improving user experience and personalization. Creating, receiving and storing important information as unstructured text has become part of users' daily routine. From purchase deliveries in Short Message Service (SMS) messages or notifications to event booking details in calendar applications, mobile devices serve as a portal for understanding user interests, behaviours and activities through information extraction. In this paper, we address the challenge of on-device extraction of user information from unstructured natural-language data drawn from heterogeneous sources such as messages, notifications and calendar entries. Because the proposed solution runs on-device, the privacy concern is effectively eliminated. Our proposed solution consists of three components: a Naïve Bayes classifier for domain identification, a dual character- and word-based Bidirectional Long Short-Term Memory (Bi-LSTM) and Conditional Random Field (CRF) model for attribute extraction, and a rule-based Entity Linker. Our solution achieved a 93.29% F1 score on five domains (shopping, travel, event, service and personal). Since on-device deployment has memory and latency constraints, we keep the model size small and the inference latency low. To demonstrate the efficacy of our approach, we also experimented on the CoNLL-2003 dataset and achieved performance comparable to existing benchmark results.
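The three-component flow described in the abstract (domain identification, attribute extraction, entity linking) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The whitespace tokenization, the add-one smoothing, the `ORDER_ID` label, the linking rule and all function names are assumptions made for illustration, and `toy_tagger` is a trivial stand-in for the Bi-LSTM-CRF model, which is not reproduced here.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesDomainClassifier:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing (assumed setup)."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def toy_tagger(tokens):
    """Hypothetical stand-in for the Bi-LSTM-CRF tagger: marks numeric tokens as ORDER_ID."""
    return ["B-ORDER_ID" if tok.isdigit() else "O" for tok in tokens]


def bio_to_spans(tokens, tags):
    """Collapse BIO tags into (label, text) spans."""
    spans, current = [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            current = [tag[2:], [tok]]
            spans.append(current)
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(tok)
        else:
            current = None
    return [(label, " ".join(words)) for label, words in spans]


def link_entities(domain, spans):
    """Assumed linking rule: keep the first value seen for each attribute label."""
    record = {"domain": domain}
    for label, value in spans:
        record.setdefault(label.lower(), value)
    return record


def extract(text, classifier, tagger):
    """Run the three stages in order: domain, attributes, linked record."""
    tokens = text.split()
    domain = classifier.predict(text)
    spans = bio_to_spans(tokens, tagger(tokens))
    return link_entities(domain, spans)
```

With a classifier fitted on a handful of labelled messages, `extract(text, classifier, toy_tagger)` returns a dictionary holding the predicted domain and the linked attributes. In an on-device setting the tagger argument would be replaced by the trained sequence model.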