System Report for CCL23-Eval Task 6: A Method For Telecom Network Fraud Case Classification Based on Two-stage Training Framework and Within-task Pretraining

Guangyu Zheng, Tingting He, Zhenyu Wang, Haochang Wang


Abstract
“Domain-specific text classification often needs more external knowledge, and fraud cases have fewer descriptions. Existing methods usually utilize single-stage deep models to extract semantic features, which is less reusable. To tackle this issue, we propose a two-stage training framework based on within-task pretraining and multi-dimensional semantic enhancement for CCL23-Eval Task 6 (Telecom Network Fraud Case Classification, FCC). Our training framework is divided into two stages. First, we pre-train using the training corpus to obtain specific BERT. The semantic mining ability of the model is enhanced from the feature space perspective by introducing adversarial training and multiple random sampling. The pseudo-labeled data is generated through the test data above a certain threshold. Second, pseudo-labeled samples are added to the training set for semantic enhancement based on the sample space dimension. We utilize the same backbone for prediction to obtain the results. Experimental results show that our proposed method outperforms the single-stage benchmarks and achieves competitive performance with 0.859259 F1. It also performs better in the few-shot patent classification task with 65.160% F1, which indicates robustness.”
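The pseudo-labeling step mentioned in the abstract (keeping test samples whose prediction clears a threshold, then adding them to the training set) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_pseudo_labels`, the threshold value of 0.9, and the toy probabilities are all assumptions made for illustration, and the abstract does not state which confidence measure or threshold the system actually uses.

```python
from typing import List, Sequence, Tuple


def select_pseudo_labels(
    texts: Sequence[str],
    probs: Sequence[Sequence[float]],
    threshold: float = 0.9,  # assumed value; the abstract only says "a certain threshold"
) -> List[Tuple[str, int]]:
    """Keep unlabeled samples whose top class probability is at least `threshold`.

    Returns (text, predicted_label) pairs that can be appended to a training set.
    """
    selected: List[Tuple[str, int]] = []
    for text, p in zip(texts, probs):
        # Index of the most probable class for this sample.
        label = max(range(len(p)), key=p.__getitem__)
        if p[label] >= threshold:
            selected.append((text, label))
    return selected


# Toy data (hypothetical): two labeled training cases and three unlabeled test cases
# with class probabilities that a first-stage classifier might have produced.
train = [("case a", 0), ("case b", 1)]
test_texts = ["case c", "case d", "case e"]
test_probs = [[0.95, 0.05], [0.55, 0.45], [0.08, 0.92]]

pseudo = select_pseudo_labels(test_texts, test_probs, threshold=0.9)

# Second stage (sketch): the confident pseudo-labeled samples join the training set.
augmented_train = train + pseudo
```

In this toy run, "case d" is dropped because its top probability (0.55) falls below the assumed threshold, so only the two confident predictions are added for the second training stage.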
Anthology ID:
2023.ccl-3.23
Volume:
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)
Month:
August
Year:
2023
Address:
Harbin, China
Editors:
Maosong Sun, Bing Qin, Xipeng Qiu, Jing Jiang, Xianpei Han
Venue:
CCL
Publisher:
Chinese Information Processing Society of China
Pages:
206–212
Language:
English
URL:
https://aclanthology.org/2023.ccl-3.23
Cite (ACL):
Guangyu Zheng, Tingting He, Zhenyu Wang, and Haochang Wang. 2023. System Report for CCL23-Eval Task 6: A Method For Telecom Network Fraud Case Classification Based on Two-stage Training Framework and Within-task Pretraining. In Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations), pages 206–212, Harbin, China. Chinese Information Processing Society of China.
Cite (Informal):
System Report for CCL23-Eval Task 6: A Method For Telecom Network Fraud Case Classification Based on Two-stage Training Framework and Within-task Pretraining (Zheng et al., CCL 2023)
PDF:
https://preview.aclanthology.org/nschneid-patch-4/2023.ccl-3.23.pdf