Combining Counting Processes and Classification Improves a Stopping Rule for Technology Assisted Review

Reem Bin-Hezam, Mark Stevenson


Abstract
Technology Assisted Review (TAR) stopping rules aim to reduce the cost of manually assessing documents for relevance by minimising the number of documents that need to be examined to ensure a desired level of recall. This paper extends an effective stopping rule using information derived from a text classifier that can be trained without the need for any additional annotation. Experiments on multiple data sets (CLEF e-Health, TREC Total Recall, TREC Legal and RCV1) showed that the proposed approach consistently improves performance and outperforms several alternative methods.
Anthology ID:
2023.findings-emnlp.171
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2023
Month:
December
Year:
2023
Address:
Singapore
Editors:
Houda Bouamor, Juan Pino, Kalika Bali
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
2603–2609
URL:
https://aclanthology.org/2023.findings-emnlp.171
DOI:
10.18653/v1/2023.findings-emnlp.171
Cite (ACL):
Reem Bin-Hezam and Mark Stevenson. 2023. Combining Counting Processes and Classification Improves a Stopping Rule for Technology Assisted Review. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 2603–2609, Singapore. Association for Computational Linguistics.
Cite (Informal):
Combining Counting Processes and Classification Improves a Stopping Rule for Technology Assisted Review (Bin-Hezam & Stevenson, Findings 2023)
PDF:
https://preview.aclanthology.org/ingest-2024-clasp/2023.findings-emnlp.171.pdf