Sudheer Chava


2024

pdf
Saliency-Aware Interpolative Augmentation for Multimodal Financial Prediction
Samyak Jain | Parth Chhabra | Atula Tejaswi Neerkaje | Puneet Mathur | Ramit Sawhney | Shivam Agarwal | Preslav Nakov | Sudheer Chava | Dinesh Manocha
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)

Predicting price variations of financial instruments for risk modeling and stock trading is challenging due to the stochastic nature of the stock market. While recent advancements in the Financial AI realm have expanded the scope of data and methods they use, such as textual and audio cues from financial earnings calls, limitations exist. Most datasets are small, and show domain distribution shifts due to the nature of their source, suggesting the exploration for data augmentation for robust augmentation strategies such as Mixup. To tackle such challenges in the financial domain, we propose SH-Mix: Saliency-guided Hierarchical Mixup augmentation technique for multimodal financial prediction tasks. SH-Mix combines multi-level embedding mixup strategies based on the contribution of each modality and context subsequences. Through extensive quantitative and qualitative experiments on financial earnings and conference call datasets consisting of text and speech, we show that SH-Mix outperforms state-of-the-art methods by 3-7%. Additionally, we show that SH-Mix is generalizable across different modalities and models.

2023

pdf
Trillion Dollar Words: A New Financial Dataset, Task & Market Analysis
Agam Shah | Suvan Paturi | Sudheer Chava
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Monetary policy pronouncements by Federal Open Market Committee (FOMC) are a major driver of financial market returns. We construct the largest tokenized and annotated dataset of FOMC speeches, meeting minutes, and press conference transcripts in order to understand how monetary policy influences financial markets. In this study, we develop a novel task of hawkish-dovish classification and benchmark various pre-trained language models on the proposed dataset. Using the best-performing model (RoBERTa-large), we construct a measure of monetary policy stance for the FOMC document release days. To evaluate the constructed measure, we study its impact on the treasury market, stock market, and macroeconomic indicators. Our dataset, models, and code are publicly available on Huggingface and GitHub under CC BY-NC 4.0 license.

2022

pdf
HYPHEN: Hyperbolic Hawkes Attention For Text Streams
Shivam Agarwal | Ramit Sawhney | Sanchit Ahuja | Ritesh Soun | Sudheer Chava
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Analyzing the temporal sequence of texts from sources such as social media, news, and parliamentary debates is a challenging problem as it exhibits time-varying scale-free properties and fine-grained timing irregularities. We propose a Hyperbolic Hawkes Attention Network (HYPHEN), which learns a data-driven hyperbolic space and models irregular powerlaw excitations using a hyperbolic Hawkes process. Through quantitative and exploratory experiments over financial NLP, suicide ideation detection, and political debate analysis we demonstrate HYPHEN’s practical applicability for modeling online text sequences in a geometry agnostic manner.

pdf
Tweet Based Reach Aware Temporal Attention Network for NFT Valuation
Ramit Sawhney | Megh Thakkar | Ritesh Soun | Atula Neerkaje | Vasu Sharma | Dipanwita Guhathakurta | Sudheer Chava
Findings of the Association for Computational Linguistics: EMNLP 2022

Non-Fungible Tokens (NFTs) are a relatively unexplored class of assets. Designing strategies to forecast NFT trends is an intricate task due to its extremely volatile nature. The market is largely driven by public sentiment and “hype”, which in turn has a high correlation with conversations taking place on social media platforms like Twitter. Prior work done for modelling stock market data does not take into account the extent of impact certain highly influential tweets and their authors can have on the market. Building on these limitations and the nature of the NFT market, we propose a novel reach-aware temporal learning approach to make predictions for forecasting future trends in the NFT market. We perform experiments on a new dataset consisting of over 1.3 million tweets and 180 thousand NFT transactions spanning over 15 NFT collections curated by us. Our model (TA-NFT) outperforms other state-of-the-art methods by an average of 36%. Through extensive quantitative and ablative analysis, we demonstrate the ability of our approach as a practical method for predicting NFT trends.

pdf
Cryptocurrency Bubble Detection: A New Stock Market Dataset, Financial Task & Hyperbolic Models
Ramit Sawhney | Shivam Agarwal | Vivek Mittal | Paolo Rosso | Vikram Nanda | Sudheer Chava
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

The rapid spread of information over social media influences quantitative trading and investments. The growing popularity of speculative trading of highly volatile assets such as cryptocurrencies and meme stocks presents a fresh challenge in the financial realm. Investigating such “bubbles” - periods of sudden anomalous behavior of markets are critical in better understanding investor behavior and market dynamics. However, high volatility coupled with massive volumes of chaotic social media texts, especially for underexplored assets like cryptocoins pose a challenge to existing methods. Taking the first step towards NLP for cryptocoins, we present and publicly release CryptoBubbles, a novel multi- span identification task for bubble detection, and a dataset of more than 400 cryptocoins from 9 exchanges over five years spanning over two million tweets. Further, we develop a set of sequence-to-sequence hyperbolic models suited to this multi-span identification task based on the power-law dynamics of cryptocurrencies and user behavior on social media. We further test the effectiveness of our models under zero-shot settings on a test set of Reddit posts pertaining to 29 “meme stocks”, which see an increase in trade volume due to social media hype. Through quantitative, qualitative, and zero-shot analyses on Reddit and Twitter spanning cryptocoins and meme-stocks, we show the practical applicability of CryptoBubbles and hyperbolic models.

pdf
When FLUE Meets FLANG: Benchmarks and Large Pretrained Language Model for Financial Domain
Raj Shah | Kunal Chawla | Dheeraj Eidnani | Agam Shah | Wendi Du | Sudheer Chava | Natraj Raman | Charese Smiley | Jiaao Chen | Diyi Yang
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

Pre-trained language models have shown impressive performance on a variety of tasks and domains. Previous research on financial language models usually employs a generic training scheme to train standard model architectures, without completely leveraging the richness of the financial data. We propose a novel domain specific Financial LANGuage model (FLANG) which uses financial keywords and phrases for better masking, together with span boundary objective and in-filing objective. Additionally, the evaluation benchmarks in the field have been limited. To this end, we contribute the Financial Language Understanding Evaluation (FLUE), an open-source comprehensive suite of benchmarks for the financial domain. These include new benchmarks across 5 NLP tasks in financial domain as well as common benchmarks used in the previous research. Experiments on these benchmarks suggest that our model outperforms those in prior literature on a variety of NLP tasks. Our models, code and benchmark data will be made publicly available on Github and Huggingface.