Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification
Konstantinos Skianis, Fragkiskos Malliaros, Michalis Vazirgiannis
Abstract
Contrary to the traditional Bag-of-Words approach, we consider the Graph-of-Words (GoW) model in which each document is represented by a graph that encodes relationships between the different terms. Based on this formulation, the importance of a term is determined by weighting the corresponding node in the document, collection and label graphs, using node centrality criteria. We also introduce novel graph-based weighting schemes by enriching graphs with word-embedding similarities, in order to reward or penalize semantic relationships. Our methods produce more discriminative feature weights for text categorization, outperforming existing frequency-based criteria.
- Anthology ID:
- W18-1707
- Volume:
- Proceedings of the Twelfth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-12)
- Month:
- June
- Year:
- 2018
- Address:
- New Orleans, Louisiana, USA
- Editors:
- Goran Glavaš, Swapna Somasundaran, Martin Riedl, Eduard Hovy
- Venue:
- TextGraphs
- Publisher:
- Association for Computational Linguistics
- Pages:
- 49–58
- URL:
- https://aclanthology.org/W18-1707
- DOI:
- 10.18653/v1/W18-1707
- Cite (ACL):
- Konstantinos Skianis, Fragkiskos Malliaros, and Michalis Vazirgiannis. 2018. Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification. In Proceedings of the Twelfth Workshop on Graph-Based Methods for Natural Language Processing (TextGraphs-12), pages 49–58, New Orleans, Louisiana, USA. Association for Computational Linguistics.
- Cite (Informal):
- Fusing Document, Collection and Label Graph-based Representations with Word Embeddings for Text Classification (Skianis et al., TextGraphs 2018)
- PDF:
- https://preview.aclanthology.org/bionlp-24-ingestion/W18-1707.pdf
- Code:
- y3nk0/Graph-Based-TC
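
To make the Graph-of-Words idea from the abstract concrete, the following is a minimal sketch of a document-level graph with a centrality-based term weight. It is not the authors' implementation (see the repository listed above for that). The function names `graph_of_words` and `degree_centrality`, the sliding-window size, and the choice of degree centrality as the node centrality criterion are all assumptions made for illustration. The collection graph, the label graph, and the word-embedding enrichment described in the abstract are not covered here.

```python
from collections import defaultdict


def graph_of_words(tokens, window=3):
    """Build an undirected co-occurrence graph for one document.

    Assumption: two distinct terms are linked when they appear within
    `window` consecutive tokens (the term itself plus the next window - 1).
    Returns a dict mapping each term to the set of its neighbours.
    """
    adj = defaultdict(set)
    for i, term in enumerate(tokens):
        adj[term]  # touch the key so terms without edges still become nodes
        for other in tokens[i + 1 : i + window]:
            if other != term:  # skip self-loops
                adj[term].add(other)
                adj[other].add(term)
    return adj


def degree_centrality(adj):
    """Weight each term by its degree, normalised by (number of nodes - 1)."""
    n = len(adj)
    if n < 2:
        return {term: 0.0 for term in adj}
    return {term: len(neighbours) / (n - 1) for term, neighbours in adj.items()}


# Hypothetical example document, already tokenised by whitespace.
tokens = "graph based term weighting for text classification with graph models".split()
weights = degree_centrality(graph_of_words(tokens, window=3))

# Print terms from most to least central.
for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}\t{weight:.2f}")
```

In this sketch a term that co-occurs with many different terms receives a higher weight than one that is merely frequent, which is the contrast with frequency-based criteria that the abstract draws.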