The Risk and Opportunity of Data Augmentation and Translation for ESG News Impact Identification with Language Models

Yosef Ardhito Winatmoko, Ali Septiandri


Abstract
This paper presents our findings in the ML-ESG-2 task, which focused on classifying news snippets in various languages as “Risk” or “Opportunity” in the ESG (Environmental, Social, and Governance) context. We experimented with data augmentation and translation facilitated by Large Language Models (LLMs). We found that augmenting the English dataset did not improve performance. By fine-tuning RoBERTa models on the original data, we achieved the top position for the English task and second place for the French task. We also achieved comparable results on the French dataset using only the English translation, securing third place for the French task with only marginal F1 differences from the second-place model.
Anthology ID:
2023.finnlp-2.10
Volume:
Proceedings of the Sixth Workshop on Financial Technology and Natural Language Processing
Month:
November
Year:
2023
Address:
Bali, Indonesia
Editors:
Chung-Chi Chen, Hen-Hsen Huang, Hiroya Takamura, Hsin-Hsi Chen, Hiroki Sakaji, Kiyoshi Izumi
Venues:
FinNLP | WS
Publisher:
Association for Computational Linguistics
Pages:
66–71
URL:
https://aclanthology.org/2023.finnlp-2.10
DOI:
10.18653/v1/2023.finnlp-2.10
Cite (ACL):
Yosef Ardhito Winatmoko and Ali Septiandri. 2023. The Risk and Opportunity of Data Augmentation and Translation for ESG News Impact Identification with Language Models. In Proceedings of the Sixth Workshop on Financial Technology and Natural Language Processing, pages 66–71, Bali, Indonesia. Association for Computational Linguistics.
Cite (Informal):
The Risk and Opportunity of Data Augmentation and Translation for ESG News Impact Identification with Language Models (Winatmoko & Septiandri, FinNLP-WS 2023)
PDF:
https://preview.aclanthology.org/nschneid-patch-1/2023.finnlp-2.10.pdf