Topic Taxonomy Construction from ESG Reports

Saif Majdi AlNajjar, Xinyu Wang, Yulan He


Abstract
The surge in Environmental, Social, and Governance (ESG) reports, essential for corporate transparency and modern investments, presents a challenge for investors due to their varying lengths and sheer volume. We present a novel methodology, called MultiTaxoGen, for creating topic taxonomies designed specifically for analysing ESG reports. Topic taxonomies illustrate the topics covered in a corpus of ESG reports while also highlighting the hierarchical relationships between them. Unfortunately, current state-of-the-art approaches for constructing topic taxonomies are designed for more general datasets, resulting in ambiguous topics and the omission of many latent topics present in ESG-focused corpora. This makes them unsuitable for the specificity required by investors. Our method instead adapts topic modelling techniques by employing them recursively on each topic’s local neighbourhood, the subcorpus of documents assigned to that topic. This recursive approach allows us to identify each topic's child topics and offers a better understanding of topic hierarchies at a fine-grained level. Our findings reveal that our method captures more latent topics in our ESG report corpus than the leading method and provides more coherent topics with comparable relational accuracy.
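The recursive idea in the abstract (fit a topic model, then refit it on the subcorpus of documents assigned to each topic) can be illustrated with a minimal sketch. This is not the MultiTaxoGen implementation: `toy_topic_model` is a hypothetical stand-in that labels documents by their most widespread term, and the names `build_taxonomy`, `max_depth`, and `min_docs` are assumptions introduced only for this example.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, FrozenSet, List

Doc = List[str]  # one tokenised document
TopicModel = Callable[[List[Doc], FrozenSet[str]], Dict[str, List[Doc]]]


def toy_topic_model(docs: List[Doc], used: FrozenSet[str]) -> Dict[str, List[Doc]]:
    """Hypothetical stand-in for a real topic model (an assumption, not the paper's).

    Assigns each document to the term with the highest document frequency
    in the current subcorpus, skipping labels already used by ancestors.
    """
    doc_freq = Counter(t for d in docs for t in set(d) if t not in used)
    groups: Dict[str, List[Doc]] = defaultdict(list)
    for d in docs:
        candidates = [t for t in set(d) if t not in used]
        if candidates:
            # Ties are broken by the term string so the result is deterministic.
            label = max(candidates, key=lambda t: (doc_freq[t], t))
            groups[label].append(d)
    return dict(groups)


def build_taxonomy(
    docs: List[Doc],
    fit: TopicModel,
    used: FrozenSet[str] = frozenset(),
    depth: int = 0,
    max_depth: int = 2,
    min_docs: int = 2,
) -> Dict[str, dict]:
    """Recursively refit the topic model on each topic's subcorpus."""
    # Stop when the tree is deep enough or the subcorpus is too small to split.
    if depth >= max_depth or len(docs) < min_docs:
        return {}
    tree: Dict[str, dict] = {}
    for label, subcorpus in fit(docs, used).items():
        # The documents assigned to `label` form its local neighbourhood.
        tree[label] = build_taxonomy(
            subcorpus, fit, used | {label}, depth + 1, max_depth, min_docs
        )
    return tree


# Invented toy corpus, for illustration only.
corpus = [
    ["climate", "emissions", "carbon"],
    ["climate", "emissions", "methane"],
    ["climate", "water", "usage"],
    ["governance", "board", "diversity"],
    ["governance", "board", "pay"],
    ["governance", "audit"],
]
taxonomy = build_taxonomy(corpus, toy_topic_model)
print(taxonomy)
```

On this toy corpus the sketch yields two root topics, `climate` (with children `emissions` and `water`) and `governance` (with children `board` and `audit`). In practice the stand-in would be replaced by an actual topic model, and the stopping criteria would need tuning to the corpus.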
Anthology ID:
2024.finnlp-1.17
Volume:
Proceedings of the Joint Workshop of the 7th Financial Technology and Natural Language Processing, the 5th Knowledge Discovery from Unstructured Data in Financial Services, and the 4th Workshop on Economics and Natural Language Processing @ LREC-COLING 2024
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Chung-Chi Chen, Xiaomo Liu, Udo Hahn, Armineh Nourbakhsh, Zhiqiang Ma, Charese Smiley, Veronique Hoste, Sanjiv Ranjan Das, Manling Li, Mohammad Ghassemi, Hen-Hsen Huang, Hiroya Takamura, Hsin-Hsi Chen
Venues:
FinNLP | WS
Publisher:
ELRA and ICCL
Pages:
178–187
URL:
https://aclanthology.org/2024.finnlp-1.17
Cite (ACL):
Saif Majdi AlNajjar, Xinyu Wang, and Yulan He. 2024. Topic Taxonomy Construction from ESG Reports. In Proceedings of the Joint Workshop of the 7th Financial Technology and Natural Language Processing, the 5th Knowledge Discovery from Unstructured Data in Financial Services, and the 4th Workshop on Economics and Natural Language Processing @ LREC-COLING 2024, pages 178–187, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Topic Taxonomy Construction from ESG Reports (AlNajjar et al., FinNLP-WS 2024)
PDF:
https://preview.aclanthology.org/nschneid-patch-2/2024.finnlp-1.17.pdf