Unsupervised Sustainability Report Labeling based on the integration of the GRI and SDG standards

Seyed Alireza Mousavian Anaraki, Danilo Croce, Roberto Basili


Abstract
Sustainability reports are key instruments for communicating corporate impact, but their unstructured format and varied content pose challenges for large-scale analysis. This paper presents an unsupervised method to annotate paragraphs from sustainability reports against both the Global Reporting Initiative (GRI) and Sustainable Development Goals (SDG) standards. The approach combines structured metadata from GRI content indexes, official GRI–SDG mappings, and text semantic similarity models to produce weakly supervised annotations at scale. To evaluate the quality of these annotations, we train a multi-label classifier on the automatically labeled data and evaluate it on the trusted OSDG Community Dataset. The results show that our method yields meaningful labels and improves classification performance when combined with human-annotated data. Although preliminary, this work offers a foundation for scalable sustainability analysis and opens future directions toward assessing the credibility and depth of corporate sustainability claims.
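The abstract names three ingredients: GRI content indexes, GRI–SDG mappings, and a text similarity model. The following is a minimal sketch of one way they could be combined, not the authors' implementation. The GRI descriptions, the `GRI_TO_SDG` table, the `label_paragraph` helper, and the threshold are all hypothetical toy values. A bag-of-words cosine stands in for whatever semantic similarity model the paper actually uses.

```python
import math
import re
from collections import Counter

# Toy stand-ins (assumptions): neither the official GRI-SDG mapping
# nor the official GRI disclosure texts.
GRI_DESCRIPTIONS = {
    "GRI 302": "energy consumption within the organization and reduction of energy use",
    "GRI 303": "water withdrawal, water discharge and water consumption",
}
GRI_TO_SDG = {
    "GRI 302": ["SDG 7", "SDG 13"],
    "GRI 303": ["SDG 6"],
}


def bow(text):
    """Lowercased bag-of-words vector; a placeholder for a sentence encoder."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def label_paragraph(paragraph, candidate_gri, threshold=0.1):
    """Assign GRI labels, then derive SDG labels through the mapping.

    candidate_gri: GRI codes that a report's content index is assumed to
    associate with the page containing the paragraph.
    """
    p = bow(paragraph)
    gri = [
        g for g in candidate_gri
        if cosine(p, bow(GRI_DESCRIPTIONS[g])) >= threshold
    ]
    sdg = sorted({s for g in gri for s in GRI_TO_SDG[g]})
    return gri, sdg


paragraph = (
    "We reduced energy consumption across our plants "
    "by switching to efficient lighting."
)
gri_labels, sdg_labels = label_paragraph(paragraph, ["GRI 302", "GRI 303"])
print(gri_labels, sdg_labels)
```

In this sketch the content index only narrows the candidate set, the similarity score decides which candidates are kept, and the SDG labels follow from the kept GRI labels. Whether the paper combines the three signals in this order is an assumption.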
Anthology ID:
2025.nlp4pi-1.13
Volume:
Proceedings of the Fourth Workshop on NLP for Positive Impact (NLP4PI)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Katherine Atwell, Laura Biester, Angana Borah, Daryna Dementieva, Oana Ignat, Neema Kotonya, Ziyi Liu, Ruyuan Wan, Steven Wilson, Jieyu Zhao
Venues:
NLP4PI | WS
Publisher:
Association for Computational Linguistics
Pages:
151–162
URL:
https://preview.aclanthology.org/landing_page/2025.nlp4pi-1.13/
DOI:
10.18653/v1/2025.nlp4pi-1.13
Cite (ACL):
Seyed Alireza Mousavian Anaraki, Danilo Croce, and Roberto Basili. 2025. Unsupervised Sustainability Report Labeling based on the integration of the GRI and SDG standards. In Proceedings of the Fourth Workshop on NLP for Positive Impact (NLP4PI), pages 151–162, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
Unsupervised Sustainability Report Labeling based on the integration of the GRI and SDG standards (Mousavian Anaraki et al., NLP4PI 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.nlp4pi-1.13.pdf