Comparison of Machine Learning Approaches for Industry Classification Based on Textual Descriptions of Companies

Andrey Tagarev, Nikola Tulechki, Svetla Boytcheva


Abstract
This paper addresses the task of categorizing companies within industry classification schemes. The dataset consists of encyclopedic articles about companies and their economic activities. The target classification schema is built by mapping linked open data in a semi-supervised manner. Target classes are built bottom-up from DBpedia. We apply several state-of-the-art text classification techniques, based on both deep learning and classical vector-space models.
Anthology ID:
R19-1134
Volume:
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
Month:
September
Year:
2019
Address:
Varna, Bulgaria
Editors:
Ruslan Mitkov, Galia Angelova
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
1169–1175
URL:
https://preview.aclanthology.org/build-pipeline-with-new-library/R19-1134/
DOI:
10.26615/978-954-452-056-4_134
Cite (ACL):
Andrey Tagarev, Nikola Tulechki, and Svetla Boytcheva. 2019. Comparison of Machine Learning Approaches for Industry Classification Based on Textual Descriptions of Companies. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 1169–1175, Varna, Bulgaria. INCOMA Ltd.
Cite (Informal):
Comparison of Machine Learning Approaches for Industry Classification Based on Textual Descriptions of Companies (Tagarev et al., RANLP 2019)
PDF:
https://preview.aclanthology.org/build-pipeline-with-new-library/R19-1134.pdf