Making Travel Smarter: Extracting Travel Information From Email Itineraries Using Named Entity Recognition

Divyansh Kaushik, Shashank Gupta, Chakradhar Raju, Reuben Dias, Sanjib Ghosh


Abstract
The purpose of this research is to address the problem of extracting information from travel itineraries and to discuss the challenges faced in the process. Business-to-customer emails such as booking confirmations and e-tickets are usually machine generated by filling slots in predefined templates, which improves the presentation of such emails but also makes their structure more complex. Extracting the relevant information from these emails would let users track their journeys and important updates in applications installed on their devices, giving them a consolidated overview of their itineraries and saving valuable time. We investigate the use of an HMM-based named entity recognizer, which we use to label and extract relevant entities from such emails. NER in such emails is challenging because these itineraries offer less useful contextual information. We also propose a rich set of domain-specific features, which are integrated into the model. The output of our model is a list of lists containing the relevant information extracted from one's itinerary.
Anthology ID:
R17-1047
Volume:
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017
Month:
September
Year:
2017
Address:
Varna, Bulgaria
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
354–362
URL:
https://doi.org/10.26615/978-954-452-049-6_047
DOI:
10.26615/978-954-452-049-6_047
Cite (ACL):
Divyansh Kaushik, Shashank Gupta, Chakradhar Raju, Reuben Dias, and Sanjib Ghosh. 2017. Making Travel Smarter: Extracting Travel Information From Email Itineraries Using Named Entity Recognition. In Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017, pages 354–362, Varna, Bulgaria. INCOMA Ltd.
Cite (Informal):
Making Travel Smarter: Extracting Travel Information From Email Itineraries Using Named Entity Recognition (Kaushik et al., RANLP 2017)
PDF:
https://doi.org/10.26615/978-954-452-049-6_047