2024
pdf
abs
Rich Semantic Knowledge Enhanced Large Language Models for Few-shot Chinese Spell Checking
Ming Dong | Yujing Chen | Miao Zhang | Hao Sun | Tingting He
Findings of the Association for Computational Linguistics ACL 2024
Chinese Spell Checking (CSC) is a widely used technology that plays a vital role in speech-to-text (STT) and optical character recognition (OCR). Most existing CSC approaches that rely on the BERT architecture achieve excellent performance. However, limited by the scale of the foundation model, BERT-based methods do not work well in few-shot scenarios, which limits their practical application. In this paper, we explore an in-context learning method named RS-LLM (Rich Semantic based LLMs) that introduces large language models (LLMs) as the foundation model. We also study the impact of introducing various kinds of Chinese rich semantic information into our framework. We find that by introducing a small number of specific Chinese rich semantic structures, LLMs achieve better performance than most BERT-based models on the few-shot CSC task. Furthermore, we conduct experiments on multiple datasets, and the experimental results verify the superiority of our proposed framework.
2023
pdf
abs
DSPM-NLG: A Dual Supervised Pre-trained Model for Few-shot Natural Language Generation in Task-oriented Dialogue System
Yufan Wang | Bowei Zou | Rui Fan | Ai Ti Aw | Tingting He
Findings of the Association for Computational Linguistics: ACL 2023
In few-shot settings, fully conveying the semantic information of the dialogue act is a crucial challenge for Natural Language Generation (NLG) in the task-oriented dialogue system. An interesting fact is that NLG and Spoken Language Understanding (SLU) are a natural dual problem pair. If the response generated by the NLG module can be restored to the corresponding dialogue act by the SLU module, then the generated response fully conveys the semantic information of the dialogue act. Based on this idea, a novel Dual Supervised Pre-trained Model for few-shot Natural Language Generation (DSPM-NLG) is proposed to regularize the pre-training process. We adopt a joint model with a dual supervised framework to learn the dual correlation between NLG and SLU from the perspective of probability. In addition, a slot-masked strategy is designed to enable the model to focus better on the key slot-value pairs. DSPM-NLG is continuously trained on existing public large-scale annotated data, thoroughly learning the duality between the two tasks to enhance the semantic control and generalization abilities of the pre-trained model. Experiments demonstrate that our proposed model performs outstandingly on the few-shot benchmark dataset and outperforms the previous SOTA results.
pdf
abs
Making Pre-trained Language Models Better Learn Few-Shot Spoken Language Understanding in More Practical Scenarios
Yufan Wang | Jie Mei | Bowei Zou | Rui Fan | Tingting He | Ai Ti Aw
Findings of the Association for Computational Linguistics: ACL 2023
Most previous few-shot Spoken Language Understanding (SLU) models need to be trained on a set of data-rich source domains and then adapted to the target domain with a few examples. In this paper, we explore a more practical scenario for few-shot SLU, in which we only assume access to a pre-trained language model and a few labeled examples without any other source domain data. We concentrate on understanding how far few-shot SLU can be pushed in this setting. To this end, we develop a prompt-based intent detection model for few-shot settings, which leverages BERT's original next sentence prediction pre-training task and a prompt template to detect the user’s intent. For slot filling, we propose an approach of reconstructing slot labels, which reduces the training complexity by reducing the number of slot labels in few-shot settings. To evaluate few-shot SLU in this more practical scenario, we present two benchmarks, FewShotATIS and FewShotSNIPS. A dynamic sampling strategy is designed to construct the two datasets according to the learning difficulty of each intent and slot. Experiments on FewShotATIS and FewShotSNIPS demonstrate that our proposed model achieves state-of-the-art performance.
pdf
abs
System Report for CCL23-Eval Task 6: A Method For Telecom Network Fraud Case Classification Based on Two-stage Training Framework and Within-task Pretraining
Guangyu Zheng | Tingting He | Zhenyu Wang | Haochang Wang
Proceedings of the 22nd Chinese National Conference on Computational Linguistics (Volume 3: Evaluations)
Domain-specific text classification often needs more external knowledge, and fraud cases have fewer descriptions. Existing methods usually utilize single-stage deep models to extract semantic features, which is less reusable. To tackle this issue, we propose a two-stage training framework based on within-task pretraining and multi-dimensional semantic enhancement for CCL23-Eval Task 6 (Telecom Network Fraud Case Classification, FCC). Our training framework is divided into two stages. First, we pre-train using the training corpus to obtain specific BERT. The semantic mining ability of the model is enhanced from the feature space perspective by introducing adversarial training and multiple random sampling. The pseudo-labeled data is generated through the test data above a certain threshold. Second, pseudo-labeled samples are added to the training set for semantic enhancement based on the sample space dimension. We utilize the same backbone for prediction to obtain the results. Experimental results show that our proposed method outperforms the single-stage benchmarks and achieves competitive performance with 0.859259 F1. It also performs better in the few-shot patent classification task with 65.160% F1, which indicates robustness.
2022
pdf
abs
DRLK: Dynamic Hierarchical Reasoning with Language Model and Knowledge Graph for Question Answering
Miao Zhang | Rufeng Dai | Ming Dong | Tingting He
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
In recent years, Graph Neural Network (GNN) approaches with enhanced knowledge graphs (KG) perform well in question answering (QA) tasks. One critical challenge is how to effectively utilize interactions between the QA context and KG. However, existing work only adopts the identical QA context representation to interact with multiple layers of KG, which results in a restricted interaction. In this paper, we propose DRLK (Dynamic Hierarchical Reasoning with Language Model and Knowledge Graphs), a novel model that utilizes dynamic hierarchical interactions between the QA context and KG for reasoning. DRLK extracts dynamic hierarchical features in the QA context, and performs inter-layer and intra-layer interactions on each iteration, allowing the KG representation to be grounded with the hierarchical features of the QA context. We conduct extensive experiments on four benchmark datasets in medical QA and commonsense reasoning. The experimental results demonstrate that DRLK achieves state-of-the-art performances on two benchmark datasets and performs competitively on the others.
2016
pdf
Bi-Transferring Deep Neural Networks for Domain Adaptation
Guangyou Zhou | Zhiwen Xie | Jimmy Xiangji Huang | Tingting He
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
2015
pdf
Learning Continuous Word Embedding with Metadata for Question Retrieval in Community Question Answering
Guangyou Zhou | Tingting He | Jun Zhao | Po Hu
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
2006
pdf
Discovering Relations among Named Entities by Detecting Community Structure
Tingting He | Junzhe Zhao | Jing Li
Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation
pdf
An Approach to Automatically Constructing Domain Ontology
Tingting He | Xiaopeng Zhang | Xinghuo Ye
Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation
pdf
Re-ranking Method Based on Topic Word Pairs
Tingting He | Ting Xu | Guozhong Qu | Xinhui Tu
Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation
2005
pdf
The Standard of Chinese Corpus Metadata
Tingting He | Xiaoqi Xu
Proceedings of the Fifth Workshop on Asian Language Resources (ALR-05) and First Symposium on Asian Language Resources Network (ALRN)
pdf
An Unsupervised Approach to Chinese Word Sense Disambiguation Based on Hownet
Hao Chen | Tingting He | Donghong Ji | Changqin Quan
International Journal of Computational Linguistics & Chinese Language Processing, Volume 10, Number 4, December 2005: Special Issue on Selected Papers from CLSW-5
2004
pdf
Chinese Text Summarization Based on Thematic Area Detection
Po Hu | Tingting He | Donghong Ji
Text Summarization Branches Out
2003
pdf
A Vector-Based Algorithm for Chinese Text Classification
Changri Luo | Tingting He
Proceedings of the 17th Pacific Asia Conference on Language, Information and Computation
pdf
Extracting Chinese Multi-Word Units from Large-Scale Balanced Corpus
Jianzhou Liu | Tingting He | Xiaohua Liu
Proceedings of the 17th Pacific Asia Conference on Language, Information and Computation