Abstract
Text retrieval systems often return large sets of documents, particularly when applied to large collections. Stopping criteria can reduce the number of these documents that need to be manually evaluated for relevance by predicting when a suitable level of recall has been achieved. In this work, a novel method for determining a stopping criterion is proposed that models the rate at which relevant documents occur using a Poisson process. This method allows a user to specify both a minimum desired level of recall to achieve and a desired probability of having achieved it. We evaluate our method on a public dataset and compare it with previous techniques for determining stopping criteria.

- Anthology ID: D19-1351
- Volume: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
- Month: November
- Year: 2019
- Address: Hong Kong, China
- Editors: Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
- Venues: EMNLP | IJCNLP
- SIG: SIGDAT
- Publisher: Association for Computational Linguistics
- Pages: 3484–3489
- URL: https://aclanthology.org/D19-1351
- DOI: 10.18653/v1/D19-1351
- Cite (ACL): Alison Sneyd and Mark Stevenson. 2019. Modelling Stopping Criteria for Search Results using Poisson Processes. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3484–3489, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal): Modelling Stopping Criteria for Search Results using Poisson Processes (Sneyd & Stevenson, EMNLP-IJCNLP 2019)
- PDF: https://preview.aclanthology.org/ml4al-ingestion/D19-1351.pdf
- Code: alisonsneyd/poisson_stopping_method
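
The abstract describes a stopping rule driven by two user-supplied values: a minimum desired recall and a desired probability of having achieved it. The sketch below illustrates that general idea only. It is not the authors' implementation from the repository listed above, and the abstract does not specify how the paper models the rate. As a simplifying assumption, it treats relevant documents among the unexamined part of the ranking as arriving at a constant rate, estimated from a trailing window of the most recent judgements. The names `poisson_quantile`, `should_stop`, and `window` are hypothetical.

```python
import math


def poisson_quantile(mu: float, prob: float) -> int:
    """Smallest k such that P(X <= k) >= prob for X ~ Poisson(mu)."""
    if mu <= 0.0:
        return 0
    # Guard against floating-point rounding keeping the CDF just below prob.
    limit = int(mu + 40 * math.sqrt(mu) + 50)
    cdf = 0.0
    for k in range(limit + 1):
        # Poisson pmf evaluated in log space to avoid overflow for large k.
        cdf += math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))
        if cdf >= prob:
            return k
    return limit


def should_stop(labels, total_docs, target_recall, confidence, window=50):
    """Decide whether to stop examining a ranked list.

    labels: 0/1 relevance judgements for the documents examined so far.
    total_docs: size of the full ranked list.
    target_recall: minimum desired recall, in (0, 1].
    confidence: desired probability of having reached that recall.
    window: ASSUMPTION, number of recent judgements used to estimate the rate.
    """
    examined = len(labels)
    found = sum(labels)
    remaining = total_docs - examined
    if remaining <= 0:
        return True   # everything has been examined
    if found == 0:
        return False  # no relevant documents seen yet, nothing to base a bound on

    # ASSUMPTION: constant rate of relevant documents per remaining document,
    # estimated from the trailing window.
    recent = labels[-window:]
    rate = sum(recent) / len(recent)

    # With probability >= confidence, at most `upper` relevant documents remain.
    upper = poisson_quantile(rate * remaining, confidence)
    recall_lower_bound = found / (found + upper)
    return recall_lower_bound >= target_recall


# Relevant documents dried up early in the ranking: the bound permits stopping.
early = [1] * 20 + [0] * 80
print(should_stop(early, total_docs=1000, target_recall=0.7, confidence=0.95))

# Relevant documents still appear every other position: keep going.
steady = [1, 0] * 50
print(should_stop(steady, total_docs=1000, target_recall=0.7, confidence=0.95))
```

A constant-rate window is deliberately crude: a window that happens to contain no relevant documents yields a rate of zero and an immediate stop, so a rate that varies with rank position would likely be needed in practice.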