MONETA: Multimodal Industry Classification through Geographic Information with Multi Agent Systems

Arda Yüksel, Gabriel Thiem, Susanne Walter, Patrick Felka, Gabriela Alves Werb, Ivan Habernal


Abstract
Industry classification schemes are integral parts of public and corporate databases, as they classify businesses based on economic activity. Due to the size of company registers, manual annotation is costly, and fine-tuning models with every update to an industry classification scheme requires significant data collection. We replicate manual expert verification by using existing or easily retrievable multimodal resources for industry classification. We present MONETA, the first multimodal industry classification benchmark with text (Website, Wikipedia, Wikidata) and geospatial sources (OpenStreetMap and satellite imagery). Our dataset comprises 1,000 businesses in Europe with 20 economic activity labels according to EU guidelines (NACE). Our training-free baseline reaches 62.10% and 74.10% with open-source and closed-source Multimodal Large Language Models (MLLMs), respectively. We observe an increase of up to 22.80% with the combination of multi-turn design, context enrichment, and classification explanations. We will release our dataset and the enhanced guidelines.
Anthology ID:
2026.acl-long.674
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
14790–14814
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.674/
Cite (ACL):
Arda Yüksel, Gabriel Thiem, Susanne Walter, Patrick Felka, Gabriela Alves Werb, and Ivan Habernal. 2026. MONETA: Multimodal Industry Classification through Geographic Information with Multi Agent Systems. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 14790–14814, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
MONETA: Multimodal Industry Classification through Geographic Information with Multi Agent Systems (Yüksel et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.674.pdf
Checklist:
2026.acl-long.674.checklist.pdf