Quantification of Biodiversity from Historical Survey Text with LLM-based Best-Worst-Scaling

Thomas Haider, Tobias Perschl, Malte Rehbein


Abstract
In this study, we evaluate methods for determining species frequency by estimating quantities from historical survey text. We first formulate the problem as classification tasks and then show that it can be adequately framed as a regression task using Best-Worst Scaling (BWS) with Large Language Models (LLMs). We test Ministral-8B, DeepSeek-V3, and GPT-4, finding that the latter two show reasonable agreement with humans and with each other. We conclude that this approach is more cost-effective than a fine-grained multi-class approach and similarly robust, allowing automated quantity estimation across species.
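As a rough illustration of how Best-Worst Scaling judgments can be turned into regression-style scores, the sketch below uses the standard BWS counting procedure: each item's score is the number of times it was picked as "best" minus the number of times it was picked as "worst", divided by the number of tuples it appeared in. This is a minimal sketch under stated assumptions, not the authors' implementation. The function name `bws_scores`, the tuple layout, and the item identifiers are hypothetical, and the judgments are made-up placeholders standing in for LLM or human annotations.

```python
from collections import defaultdict


def bws_scores(judgments):
    """Compute counting-based BWS scores in the range [-1, 1].

    `judgments` is an iterable of (items, best, worst) triples, where
    `items` is the tuple of item ids shown together, and `best` / `worst`
    are the ids the annotator (human or LLM) picked from that tuple.
    This layout is an assumption made for illustration.
    """
    best_count = defaultdict(int)
    worst_count = defaultdict(int)
    appearances = defaultdict(int)

    for items, best, worst in judgments:
        for item in items:
            appearances[item] += 1
        best_count[best] += 1
        worst_count[worst] += 1

    # Score = (times best - times worst) / times shown
    return {
        item: (best_count[item] - worst_count[item]) / appearances[item]
        for item in appearances
    }


# Hypothetical judgments over five text snippets; "best" here would mean
# "describes the highest quantity" and "worst" the lowest.
judgments = [
    (("a", "b", "c", "d"), "a", "d"),
    (("a", "b", "c", "e"), "a", "c"),
    (("b", "c", "d", "e"), "b", "d"),
]

scores = bws_scores(judgments)
for item, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{item}: {score:+.2f}")
```

Sorting items by score yields a ranking on a continuous scale, which is what allows the problem to be treated as regression rather than as classification into fixed quantity classes.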
Anthology ID:
2025.nlp4ecology-1.13
Volume:
Proceedings of the 1st Workshop on Ecology, Environment, and Natural Language Processing (NLP4Ecology2025)
Month:
March
Year:
2025
Address:
Tallinn, Estonia
Editors:
Valerio Basile, Cristina Bosco, Francesca Grasso, Muhammad Okky Ibrohim, Maria Skeppstedt, Manfred Stede
Venues:
NLP4Ecology | WS
Publisher:
University of Tartu Library
Pages:
61–67
URL:
https://preview.aclanthology.org/landing_page/2025.nlp4ecology-1.13/
Cite (ACL):
Thomas Haider, Tobias Perschl, and Malte Rehbein. 2025. Quantification of Biodiversity from Historical Survey Text with LLM-based Best-Worst-Scaling. In Proceedings of the 1st Workshop on Ecology, Environment, and Natural Language Processing (NLP4Ecology2025), pages 61–67, Tallinn, Estonia. University of Tartu Library.
Cite (Informal):
Quantification of Biodiversity from Historical Survey Text with LLM-based Best-Worst-Scaling (Haider et al., NLP4Ecology 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.nlp4ecology-1.13.pdf