FiNLP at FinCausal 2020 Task 1: Mixture of BERTs for Causal Sentence Identification in Financial Texts

Sarthak Gupta

Abstract

This paper describes our system for sub-task 1 of the FinCausal shared task at the FNP-FNS workshop, held in conjunction with COLING-2020. The system classifies whether a financial news text segment contains causality. To address this task, we fine-tune and ensemble generic BERT language models and domain-specific ones pre-trained on financial text corpora. The task data is highly imbalanced, with non-causal segments forming the majority class; we therefore train the models using strategies such as under-sampling, cost-sensitive learning, and data augmentation. Our best system achieves a weighted F1-score of 96.98, securing 4th position on the evaluation leaderboard. The code is available at https://github.com/sarthakTUM/fincausal
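
As a rough illustration of two ideas named in the abstract, the sketch below shows one common way to derive class weights for cost-sensitive learning and one common way to ensemble several classifiers (averaging their predicted probabilities). It is not taken from the paper or its repository: the function names, the inverse-frequency weighting formula, the 0.5 threshold, and the toy numbers are all assumptions made for the example.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Hypothetical helper: weight each class by n_samples / (n_classes * class_count).

    Rare classes (here, the causal class) receive larger weights, which is one
    common way to set up a cost-sensitive loss. The paper may use another scheme.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def soft_vote(prob_lists, threshold=0.5):
    """Hypothetical helper: average per-model causal-class probabilities, then threshold."""
    n_models = len(prob_lists)
    averaged = [sum(probs) / n_models for probs in zip(*prob_lists)]
    return [1 if p >= threshold else 0 for p in averaged]


# Toy imbalanced label set: nine non-causal (0) segments, one causal (1) segment.
labels = [0] * 9 + [1]
weights = inverse_frequency_weights(labels)

# Toy causal-class probabilities from two imagined fine-tuned models.
model_a = [0.9, 0.2, 0.6]
model_b = [0.7, 0.4, 0.3]
preds = soft_vote([model_a, model_b])

print(weights)
print(preds)
```

In a real training setup, weights like these would typically be passed to the loss function, and the averaged probabilities would come from the fine-tuned generic and financial-domain BERT models rather than from hand-written lists.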

Anthology ID: 2020.fnp-1.12
Volume: Proceedings of the 1st Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation
Month: December
Year: 2020
Address: Barcelona, Spain (Online)
Venues: COLING | FNP
Publisher: COLING
Pages: 74–79
URL: https://aclanthology.org/2020.fnp-1.12
Cite (ACL): Sarthak Gupta. 2020. FiNLP at FinCausal 2020 Task 1: Mixture of BERTs for Causal Sentence Identification in Financial Texts. In Proceedings of the 1st Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation, pages 74–79, Barcelona, Spain (Online). COLING.
Cite (Informal): FiNLP at FinCausal 2020 Task 1: Mixture of BERTs for Causal Sentence Identification in Financial Texts (Gupta, FNP 2020)
PDF: https://preview.aclanthology.org/update-css-js/2020.fnp-1.12.pdf
Code: sarthaktum/fincausal
Data: SemEval-2010 Task 8
