Neural Transition Based Parsing of Web Queries: An Entity Based Approach

Rivka Malca, Roi Reichart


Abstract
Web queries with question intent manifest a complex syntactic structure, and processing this structure is important for their interpretation. Pinter et al. (2016) formalized the grammar of these queries and proposed semi-supervised algorithms for adapting parsers originally designed for the standard dependency grammar, so that they can account for the unique forest grammar of queries. However, their algorithms rely on resources typically not available outside of large web corporations. We propose a new BiLSTM query parser that: (1) explicitly accounts for the unique grammar of web queries; and (2) utilizes named entity (NE) information from a BiLSTM NE tagger that can be jointly trained with the parser. To train our model, we annotate the query treebank of Pinter et al. (2016) with NEs. When trained on 2500 annotated queries, our parser achieves a UAS of 83.5% and a segmentation F1-score of 84.5, substantially outperforming existing state-of-the-art parsers.
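As an illustration of the two reported metrics, the following is a minimal sketch of how UAS and a segmentation F1-score could be computed for a query parsed as a forest. It is not the authors' evaluation code. It assumes that a parse is given as a list of 1-based head indices with 0 marking a root, that a forest may contain several roots, and that a "segment" is the token span covered by one tree of the forest. The function names (`uas`, `segments`, `segmentation_f1`) and the example head lists are hypothetical, and the paper may define segmentation F1 differently.

```python
from typing import Dict, List, Set, Tuple


def uas(gold_heads: List[int], pred_heads: List[int]) -> float:
    """Unlabeled attachment score: share of tokens with the correct head.

    Heads are 1-based token indices; 0 marks a root. A forest parse may
    therefore contain several zeros (assumption of this sketch).
    """
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted parses differ in length")
    if not gold_heads:
        return 0.0
    correct = sum(g == p for g, p in zip(gold_heads, pred_heads))
    return correct / len(gold_heads)


def _root_of(token: int, heads: List[int]) -> int:
    """Follow head links from a token up to the root of its tree."""
    seen = set()
    while heads[token - 1] != 0:
        if token in seen:
            raise ValueError("cycle in head list")
        seen.add(token)
        token = heads[token - 1]
    return token


def segments(heads: List[int]) -> Set[Tuple[int, int]]:
    """Return one (first_token, last_token) span per tree in the forest.

    Assumption: each tree covers a contiguous run of tokens, so a span
    is fully described by its smallest and largest token index.
    """
    by_root: Dict[int, List[int]] = {}
    for token in range(1, len(heads) + 1):
        by_root.setdefault(_root_of(token, heads), []).append(token)
    return {(min(toks), max(toks)) for toks in by_root.values()}


def segmentation_f1(gold_heads: List[int], pred_heads: List[int]) -> float:
    """F1 over exactly matching segment spans (an assumed definition)."""
    gold, pred = segments(gold_heads), segments(pred_heads)
    matched = len(gold & pred)
    if matched == 0:
        return 0.0
    precision = matched / len(pred)
    recall = matched / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical four-token query with two gold segments: tokens 1-2 and 3-4.
gold = [2, 0, 4, 0]
# Same segments as gold, but the second tree is rooted at the wrong token.
pred_same_segments = [2, 0, 0, 3]
# All four tokens attached into a single tree.
pred_one_tree = [2, 0, 2, 3]

print(uas(gold, pred_same_segments), segmentation_f1(gold, pred_same_segments))
print(uas(gold, pred_one_tree), segmentation_f1(gold, pred_one_tree))
```

In this toy example both predictions get half of the heads right, yet only the first recovers the gold segment boundaries, which is why the two scores are reported separately.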
Anthology ID:
D18-1290
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
2700–2710
URL:
https://aclanthology.org/D18-1290
DOI:
10.18653/v1/D18-1290
Cite (ACL):
Rivka Malca and Roi Reichart. 2018. Neural Transition Based Parsing of Web Queries: An Entity Based Approach. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2700–2710, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Neural Transition Based Parsing of Web Queries: An Entity Based Approach (Malca & Reichart, EMNLP 2018)
PDF:
https://preview.aclanthology.org/ml4al-ingestion/D18-1290.pdf