Combining Word Vector Technique and Clustering Algorithm for Credit Card Merchant Detection
Fang-Ju Lee | Ying-Chun Lo | Jheng-Long Wu
Proceedings of the 34th Conference on Computational Linguistics and Speech Processing (ROCLING 2022)
Extracting relevant user behavior from customers' transaction descriptions is one way to collect customer information. In the text mining field, most current research studies text classification, and only a few studies address text clustering. This study aims to find the relationships between letters and words in unstructured transaction descriptions. It uses word embedding and text mining techniques to overcome the limitation that classification categories must be defined in advance, to establish automatic identification and analysis methods, and to improve grouping accuracy. Working from the content of credit card transaction descriptions, we use Jieba to segment the Chinese text, extract features with Word2Vec, and run cross-combination experiments with Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Hierarchical Agglomerative Clustering. The best result is an average F1 score of 67.58% across the MUC, B3, and CEAF metrics.
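To make the evaluation concrete, the following is a minimal sketch of the B3 (B-cubed) metric named in the abstract, written in plain Python. It is an illustration under stated assumptions, not the authors' evaluation code: the function name `b_cubed`, the toy labels, and the choice to treat every distinct label (including a DBSCAN noise label such as -1) as an ordinary cluster are all hypothetical. The abstract does not say how the three F1 scores are averaged, so only B3 is shown.

```python
from collections import Counter


def b_cubed(predicted, gold):
    """Return B3 (precision, recall, F1) for two equal-length label lists.

    predicted[i] is the cluster assigned to item i, gold[i] its reference
    group. Assumption: every distinct label counts as one cluster, so a
    noise label such as -1 would be scored as a single cluster.
    """
    if len(predicted) != len(gold) or not gold:
        raise ValueError("label lists must be non-empty and of equal length")

    pred_sizes = Counter(predicted)          # size of each predicted cluster
    gold_sizes = Counter(gold)               # size of each reference group
    overlap = Counter(zip(predicted, gold))  # items shared by a (cluster, group) pair

    pairs = list(zip(predicted, gold))
    n = len(pairs)
    # Per-item precision: share of the item's cluster that has its gold label.
    precision = sum(overlap[(p, g)] / pred_sizes[p] for p, g in pairs) / n
    # Per-item recall: share of the item's gold group found in its cluster.
    recall = sum(overlap[(p, g)] / gold_sizes[g] for p, g in pairs) / n
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data: five transaction descriptions, two merchants.
predicted_clusters = [0, 0, 0, 1, 1]
gold_merchants = ["a", "a", "b", "b", "b"]
precision, recall, f1 = b_cubed(predicted_clusters, gold_merchants)
print(f"B3 precision={precision:.4f} recall={recall:.4f} F1={f1:.4f}")
```

In this toy case one description of merchant "b" is placed in the cluster dominated by merchant "a", which lowers both precision and recall below 1.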