Financial Technology and Natural Language Processing (2023)


up

pdf (full)
bib (full)
Proceedings of the Fifth Workshop on Financial Technology and Natural Language Processing and the Second Multimodal AI For Financial Forecasting


up

pdf (full)
bib (full)
Proceedings of the Sixth Workshop on Financial Technology and Natural Language Processing

Natural language processing (NLP) has recently gained relevance within financial institutions by providing highly valuable insights into companies and markets’ financial documents. However, the landscape of the financial domain presents extra challenges for NLP, due to the complexity of the texts and the use of specific terminology. Generalist language models tend to fall short in tasks specifically tailored for finance, even when using large language models (LLMs) with great natural language understanding and generative capabilities. This paper presents a study on LLM adaptation methods targeted at the financial domain and with high emphasis on financial sentiment analysis. To this purpose, two foundation models with less than 1.5B parameters have been adapted using a wide range of strategies. We show that through careful fine-tuning on both financial documents and instructions, these foundation models can be adapted to the target domain. Moreover, we observe that small LLMs have comparable performance to larger scale models, while being more efficient in terms of parameters and data. In addition to the models, we show how to generate artificial instructions through LLMs to augment the number of samples of the instruction dataset.
In this paper, we present ECL, a novel multimodal dataset containing the textual and numerical data from corporate 10K filings and associated binary bankruptcy labels. Furthermore, we develop and critically evaluate several classical and neural bankruptcy prediction models using this dataset. Our findings suggest that the information contained in each data modality is complementary for bankruptcy prediction. We also see that the binary bankruptcy prediction target does not enable our models to distinguish next year bankruptcy from an unhealthy financial situation resulting in bankruptcy in later years. Finally, we explore the use of LLMs in the context of our task. We show how GPT-based models can be used to extract meaningful summaries from the textual data but zero-shot bankruptcy prediction results are poor. All resources required to access and update the dataset or replicate our experiments are available on github.com/henriarnoUG/ECL.
The purpose of this paper is to construct a model for the generation of sophisticated headlines pertaining to stock price fluctuation articles, derived from the articles’ content. With respect to this headline generation objective, this paper solves three distinct tasks: in addition to the task of generating article headlines, two other tasks of extracting security names, and ascertaining the trajectory of stock prices, whether they are rising or declining. Regarding the headline generation task, we also revise the task as the model utilizes the outcomes of the security name extraction and rise/decline determination tasks, thereby for the purpose of preventing the inclusion of erroneous security names. We employed state-of-the-art pre-trained models from the field of natural language processing, fine-tuning these models for each task to enhance their precision. The dataset utilized for fine-tuning comprises a collection of articles delineating the rise and decline of stock prices. Consequently, we achieved remarkably high accuracy in the dual tasks of security name extraction and stock price rise or decline determination. For the headline generation task, a significant portion of the test data yielded fitting headlines.
Audit reports are a window to the financial health of a company and hence gauging coverage of various audit aspects in them is important. In this paper, we aim at determining an audit report’s coverage through classification of its sentences into multiple domain specific classes. In a weakly supervised setting, we employ a rule-based approach to automatically create training data for a BERT-based multi-label classifier. We then devise an ensemble to combine both the rule based and classifier approaches. Further, we employ two novel ways to improve the ensemble’s generalization: (i) through an active learning based approach and, (ii) through a LLM based review. We demonstrate that our proposed approaches outperform several baselines. We show utility of the proposed approaches to measure audit coverage on a large dataset of 2.8K audit reports.
Relation extraction (RE) is a crucial task in natural language processing (NLP) that aims to identify and classify relationships between entities mentioned in text. In the financial domain, relation extraction plays a vital role in extracting valuable information from financial documents, such as news articles, earnings reports, and company filings. This paper describes our solution to relation extraction on one such dataset REFinD. The dataset was released along with shared task as a part of the Fourth Workshop on Knowledge Discovery from Unstructured Data in Financial Services, co-located with SIGIR 2023. In this paper, we employed OpenAI models under the framework of in-context learning (ICL). We utilized two retrieval strategies to find top K relevant in-context learning demonstrations / examples from training data for a given test example. The first retrieval mechanism, we employed, is a learning-free dense retriever and the other system is a learning-based retriever. We were able to achieve 3rd rank overall. Our best F1-score is 0.718.
Assessing a company’s sustainable development goes beyond just financial metrics; the inclusion of environmental, social, and governance (ESG) factors is becoming increasingly vital. The ML-ESG shared task series seeks to pioneer discussions on news-driven ESG ratings, drawing inspiration from the MSCI ESG rating guidelines. In its second edition, ML-ESG-2 emphasizes impact type identification, offering datasets in four languages: Chinese, English, French, and Japanese. Of the 28 teams registered, 8 participated in the official evaluation. This paper presents a comprehensive overview of ML-ESG-2, detailing the dataset specifics and summarizing the performance outcomes of the participating teams.
The paper presents a concise summary of our work for the ML-ESG-2 shared task, exclusively on the Chinese and English datasets. ML-ESG-2 aims to ascertain the influence of news articles on corporations, specifically from an ESG perspective. To this end, we generally explored the capability of key information for impact identification and experimented with various techniques at different levels. For instance, we attempted to incorporate important information at the word level with TF-IDF, at the sentence level with TextRank, and at the document level with summarization. The final results reveal that the one with GPT-4 for summarisation yields the best predictions.
With the growing interest in Green Investing, Environmental, Social, and Governance (ESG) factors related to Institutions and financial entities has become extremely important for investors. While the classification of potential ESG factors is an important issue, identifying whether the factors positively or negatively impact the Institution is also a key aspect to consider while making evaluations for ESG scores. This paper presents our solution to identify ESG impact types in four languages (English, Chinese, Japanese, French) released as shared tasks during the FinNLP workshop at the IJCNLP-AACL-2023 conference. We use a combination of translation, masked language modeling, paraphrasing, and classification to solve this problem and use a generalized pipeline that performs well across all four languages. Our team ranked 1st in the Chinese and Japanese sub-tasks.
In this paper, we present our solutions to the ML-ESG-2 shared task which is co-located with the FinNLP workshop at IJCNLP-AACL-2023. The task proposes an objective of binary classification of ESG-related news based on what type of impact they can have on a company - Risk or Opportunity. We report the results of three systems, which ranked 2nd, 9th, and 10th in the final leaderboard for the English language, with the best solution achieving over 0.97 in F1 score.
This paper presents our findings in the ML-ESG-2 task, which focused on classifying a news snippet of various languages as “Risk” or “Opportunity” in the ESG (Environmental, Social, and Governance) context. We experimented with data augmentation and translation facilitated by Large Language Models (LLM). We found that augmenting the English dataset did not help to improve the performance. By fine-tuning RoBERTa models with the original data, we achieved the top position for the English and second place for the French task. In contrast, we could achieve comparable results on the French dataset by solely using the English translation, securing the third position for the French task with only marginal F1 differences to the second-place model.
In this paper, we describe our approach to the ML-ESG-2 shared task, co-located with the FinNLP workshop at IJCNLP-AACL-2023. The task aims at classifying news articles into categories reflecting either “Opportunity” or “Risk” from an ESG standpoint for companies. Our innovative methodology leverages two distinct systems for optimal text classification. In the initial phase, we engage in prompt engineering, working in conjunction with semantic similarity and using the Claude 2 LLM. Subsequently, we apply fine-tuning techniques to the Llama 2 and Dolly LLMs to enhance their performance. We report the results of five different approaches in this paper, with our top models ranking first in the French category and sixth in the English category.
In this paper, we discuss our (Team HHU’s) submission to the Multi-Lingual ESG Impact Type Identification task (ML-ESG-2). The goal of this task is to determine if an ESG-related news article represents an opportunity or a risk. We use an adapter-based framework in order to train multiple adapter modules which capture different parts of the knowledge present in the training data. Experimenting with various Adapter Fusion setups, we focus both on combining the ESG-aspect-specific knowledge, and on combining the language-specific-knowledge. Our results show that in both cases, it is possible to effectively compose the knowledge in order to improve the impact type determination.
In the evolving landscape of Environmental, Social, and Corporate Governance (ESG) impact assessment, the ML-ESG-2 shared task proposes identifying ESG impact types. To address this challenge, we present a comprehensive system leveraging ensemble learning techniques, capitalizing on early and late fusion approaches. Our approach employs four distinct models: mBERT, FlauBERT-base, ALBERT-base-v2, and a Multi-Layer Perceptron (MLP) incorporating Latent Semantic Analysis (LSA) and Term Frequency-Inverse Document Frequency (TF-IDF) features. Through extensive experimentation, we find that our early fusion ensemble approach, featuring the integration of LSA, TF-IDF, mBERT, FlauBERT-base, and ALBERT-base-v2, delivers the best performance. Our system offers a comprehensive ESG impact type identification solution, contributing to the responsible and sustainable decision-making processes vital in today’s financial and corporate governance landscape.