Arshiya Aggarwal


2021

An Empirical Investigation of Bias in the Multimodal Analysis of Financial Earnings Calls
Ramit Sawhney | Arshiya Aggarwal | Rajiv Ratn Shah
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Volatility prediction is complex due to the stock market’s stochastic nature. Existing research focuses on the textual elements of financial disclosures, such as earnings call transcripts, to forecast stock volatility and risk, but ignores the rich acoustic features in company executives’ speech. Recently, multimodal approaches that leverage the verbal and vocal cues of speakers in financial disclosures have significantly outperformed previous state-of-the-art approaches, demonstrating the benefits of multimodality and speech. However, the financial realm is still plagued by a severe underrepresentation of communities spanning diverse demographics, gender, and native speech. While multimodal models are better risk forecasters, it is imperative to also investigate the potential bias that these models may learn from the speech signals of company executives. In this work, we present the first study of gender bias in multimodal volatility prediction, arising from gender-sensitive audio features and the scarcity of female executives in earnings calls of companies in the S&P 500, one of the world’s largest stock indexes. We quantitatively analyze bias as error disparity and investigate the sources of this bias. Our results suggest that multimodal neural financial models accentuate gender-based stereotypes.
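
The abstract quantifies bias as error disparity between speaker groups. A minimal sketch of that notion, assuming per-example volatility prediction errors and a hypothetical boolean gender-label array; this illustrates the general idea of group-wise error disparity, not the paper’s exact metric:

```python
import numpy as np

def error_disparity(y_true, y_pred, group):
    """Mean-squared-error disparity between two speaker groups.

    y_true, y_pred : arrays of true and predicted (log) volatility
    group          : boolean array, True for one group (e.g. female
                     executives), False for the other -- hypothetical labels
    """
    err = (y_true - y_pred) ** 2
    mse_a = err[group].mean()    # error on the underrepresented group
    mse_b = err[~group].mean()   # error on the majority group
    return mse_a - mse_b         # > 0 => model is less accurate for group A

# Illustrative usage with synthetic data
rng = np.random.default_rng(0)
y_true = rng.normal(size=100)
y_pred = y_true + rng.normal(scale=0.5, size=100)
group = rng.random(100) < 0.2    # e.g. roughly 20% female speakers
print(error_disparity(y_true, y_pred, group))
```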

2020

VolTAGE: Volatility Forecasting via Text Audio Fusion with Graph Convolution Networks for Earnings Calls
Ramit Sawhney | Piyush Khanna | Arshiya Aggarwal | Taru Jain | Puneet Mathur | Rajiv Ratn Shah
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

Natural language processing has recently driven advances in stock movement and volatility forecasting, leading to improved financial forecasting. Transcripts of companies’ earnings calls are well studied for risk modeling, offering unique investment insight into stock performance. However, vocal cues in the speech of company executives present an underexplored, rich source of natural language data for estimating financial risk. Additionally, most existing approaches ignore the correlations between stocks. Building on existing work, we introduce a neural model for stock volatility prediction that accounts for stock interdependence via graph convolutions while fusing verbal, vocal, and financial features in a semi-supervised multi-task risk forecasting formulation. Our proposed model, VolTAGE, outperforms existing methods, demonstrating the effectiveness of multimodal learning for volatility prediction.
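
The abstract describes fusing verbal and vocal features and propagating them over a stock-correlation graph via graph convolutions. A minimal sketch of that pattern in PyTorch, assuming per-stock text and audio embeddings and a normalized adjacency matrix; the layer sizes and fusion scheme are illustrative assumptions, not the paper’s VolTAGE implementation:

```python
import torch
import torch.nn as nn

class MultimodalGCN(nn.Module):
    """Fuse text and audio embeddings per stock, then propagate them
    over a stock-correlation graph with one graph-convolution step."""

    def __init__(self, text_dim, audio_dim, hidden_dim):
        super().__init__()
        self.fuse = nn.Linear(text_dim + audio_dim, hidden_dim)
        self.gcn = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, text, audio, adj):
        # text:  (num_stocks, text_dim)   call-transcript embeddings
        # audio: (num_stocks, audio_dim)  vocal-cue embeddings
        # adj:   (num_stocks, num_stocks) normalized adjacency matrix
        h = torch.relu(self.fuse(torch.cat([text, audio], dim=-1)))
        return torch.relu(self.gcn(adj @ h))  # neighborhood aggregation

# Illustrative usage with random tensors
n = 8
model = MultimodalGCN(text_dim=16, audio_dim=12, hidden_dim=32)
adj = torch.eye(n)  # placeholder adjacency; real use would encode stock correlations
out = model(torch.randn(n, 16), torch.randn(n, 12), adj)
print(out.shape)    # torch.Size([8, 32])
```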