Waste-Bench: A Comprehensive Benchmark for Evaluating VLLMs in Cluttered Environments

Muhammad Ali; Salman Khan

Abstract

Recent advancements in Large Language Models (LLMs) have paved the way for Vision Large Language Models (VLLMs) capable of performing a wide range of visual understanding tasks. While LLMs have demonstrated impressive performance on standard natural images, their capabilities have not been thoroughly explored on cluttered datasets featuring complex environments and deformed objects. In this work, we introduce a novel dataset specifically designed for waste classification in real-world scenarios, characterized by complex environments and deformed objects. Along with this dataset, we present an in-depth evaluation approach to rigorously assess the robustness and accuracy of VLLMs. The introduced dataset and comprehensive analysis provide valuable insights into the performance of VLLMs under challenging conditions. Our findings highlight the critical need for further advancements in the robustness of VLLMs so that they perform better in complex environments. The dataset and code for our experiments are available at https://github.com/aliman80/wastebench.

Anthology ID: 2025.emnlp-main.1578
Volume: Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2025
Address: Suzhou, China
Editors: Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, Violet Peng
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 31007–31020
URL: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1578/
Cite (ACL): Muhammad Ali and Salman Khan. 2025. Waste-Bench: A Comprehensive Benchmark for Evaluating VLLMs in Cluttered Environments. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 31007–31020, Suzhou, China. Association for Computational Linguistics.
Cite (Informal): Waste-Bench: A Comprehensive Benchmark for Evaluating VLLMs in Cluttered Environments (Ali & Khan, EMNLP 2025)
PDF: https://preview.aclanthology.org/ingest-emnlp/2025.emnlp-main.1578.pdf
Checklist: 2025.emnlp-main.1578.checklist.pdf
