Numeracy-600K: Learning Numeracy for Detecting Exaggerated Information in Market Comments

Chung-Chi Chen, Hen-Hsen Huang, Hiroya Takamura, Hsin-Hsi Chen


Abstract
In this paper, we attempt to answer the question of whether neural network models can learn numeracy, which is the ability to predict the magnitude of a numeral at some specific position in a text description. A large benchmark dataset, called Numeracy-600K, is provided for the novel task. We explore several neural network models including CNN, GRU, BiGRU, CRNN, CNN-capsule, GRU-capsule, and BiGRU-capsule in the experiments. The results show that the BiGRU model gets the best micro-averaged F1 score of 80.16%, and the GRU-capsule model gets the best macro-averaged F1 score of 64.71%. Besides discussing the challenges through comprehensive experiments, we also present an important application scenario, i.e., detecting exaggerated information, for the task.
Anthology ID:
P19-1635
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
6307–6313
Language:
URL:
https://aclanthology.org/P19-1635
DOI:
10.18653/v1/P19-1635
Bibkey:
Cite (ACL):
Chung-Chi Chen, Hen-Hsen Huang, Hiroya Takamura, and Hsin-Hsi Chen. 2019. Numeracy-600K: Learning Numeracy for Detecting Exaggerated Information in Market Comments. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 6307–6313, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Numeracy-600K: Learning Numeracy for Detecting Exaggerated Information in Market Comments (Chen et al., ACL 2019)
Copy Citation:
PDF:
https://preview.aclanthology.org/update-css-js/P19-1635.pdf
Supplementary:
 P19-1635.Supplementary.pdf
Code
 aistairc/Numeracy-600K