Modelling Stopping Criteria for Search Results using Poisson Processes

Alison Sneyd, Mark Stevenson


Abstract
Text retrieval systems often return large sets of documents, particularly when applied to large collections. Stopping criteria can reduce the number of these documents that need to be manually evaluated for relevance by predicting when a suitable level of recall has been achieved. In this work, a novel method for determining a stopping criterion is proposed that models the rate at which relevant documents occur using a Poisson process. This method allows a user to specify both a minimum desired level of recall to achieve and a desired probability of having achieved it. We evaluate our method on a public dataset and compare it with previous techniques for determining stopping criteria.
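The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the rate is modelled, so the sketch assumes a constant (homogeneous) Poisson rate, estimated from the most recently judged documents. The function names `poisson_quantile` and `should_stop` and the `window` parameter are hypothetical. Given the user's target recall and desired probability, the sketch bounds the number of relevant documents still unseen and stops once the implied recall is high enough.

```python
import math


def poisson_quantile(mu, prob, max_k):
    """Smallest k with P(X <= k) >= prob for X ~ Poisson(mu), capped at max_k."""
    if mu <= 0:
        return 0
    log_term = -mu                 # log P(X = 0)
    cdf = math.exp(log_term)
    k = 0
    # Terms are accumulated in log space so that a large mu does not stall the loop.
    while cdf < prob and k < max_k:
        k += 1
        log_term += math.log(mu) - math.log(k)   # log P(X = k)
        cdf += math.exp(log_term)
    return k


def should_stop(judgments, total_docs, target_recall, prob, window=100):
    """Decide whether to stop examining a ranked list.

    judgments: 0/1 relevance labels for the documents examined so far.
    Assumption (ours, not the paper's): relevant documents in the unexamined
    part arrive as a Poisson process with a constant rate, estimated from the
    last `window` judgments.
    """
    examined = len(judgments)
    if examined == 0:
        return False
    found = sum(judgments)
    recent = judgments[-window:]
    rate = sum(recent) / len(recent)          # relevant documents per document
    remaining = total_docs - examined
    mu = rate * remaining                     # expected relevant documents left
    # Upper bound on unseen relevant documents that holds with probability >= prob.
    upper = poisson_quantile(mu, prob, remaining)
    if found + upper == 0:
        return False                          # no evidence of any relevant document
    return found / (found + upper) >= target_recall


if __name__ == "__main__":
    labels = [1] * 20 + [0] * 180
    print(should_stop(labels, total_docs=250, target_recall=0.9, prob=0.95))
```

A constant rate keeps the example short, but it will be overconfident when the recent window happens to contain no relevant documents. A rate that decays with rank would be a more realistic assumption for ranked retrieval output.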
Anthology ID:
D19-1351
Volume:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
Venues:
EMNLP | IJCNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
3484–3489
URL:
https://aclanthology.org/D19-1351
DOI:
10.18653/v1/D19-1351
Cite (ACL):
Alison Sneyd and Mark Stevenson. 2019. Modelling Stopping Criteria for Search Results using Poisson Processes. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3484–3489, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Modelling Stopping Criteria for Search Results using Poisson Processes (Sneyd & Stevenson, EMNLP-IJCNLP 2019)
PDF:
https://preview.aclanthology.org/ml4al-ingestion/D19-1351.pdf
Code:
alisonsneyd/poisson_stopping_method