A strong baseline for question relevancy ranking

Ana González-Ledesma; Isabelle Augenstein; Anders Søgaard

doi:10.18653/v1/D18-1515

Abstract

The best systems at the SemEval-16 and SemEval-17 community question answering shared tasks – tasks that amount to question relevancy ranking – involve complex pipelines and manual feature engineering. Despite this, many of them still fail to beat the IR baseline, i.e., the rankings provided by Google’s search engine. We present a strong baseline for question relevancy ranking by training a simple multi-task feed-forward network on a bag of 14 distance measures for the input question pair. This baseline model, which is fast to train and uses only language-independent features, outperforms the best shared task systems on the task of retrieving relevant previously asked questions.

Anthology ID: D18-1515
Volume: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month: October-November
Year: 2018
Address: Brussels, Belgium
Venue: EMNLP
SIG: SIGDAT
Publisher: Association for Computational Linguistics
Pages: 4810–4815
URL: https://aclanthology.org/D18-1515
DOI: 10.18653/v1/D18-1515
Cite (ACL): Ana Gonzalez, Isabelle Augenstein, and Anders Søgaard. 2018. A strong baseline for question relevancy ranking. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 4810–4815, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal): A strong baseline for question relevancy ranking (Gonzalez et al., EMNLP 2018)
PDF: https://preview.aclanthology.org/remove-xml-comments/D18-1515.pdf
Video: https://vimeo.com/306167593
