Wenpeng Lu


2020

Intra-Correlation Encoding for Chinese Sentence Intention Matching
Xu Zhang | Yifeng Li | Wenpeng Lu | Ping Jian | Guoqiang Zhang
Proceedings of the 28th International Conference on Computational Linguistics

Sentence intention matching is vital for natural language understanding. In Chinese sentence intention matching in particular, the ambiguity of Chinese words makes semantic loss or semantic confusion more likely during encoding. Although existing methods enrich text representations with pre-trained word embeddings to address this problem, the granularity of the pre-trained embeddings affects how the semantics of a piece of Chinese text are described. In this paper, we propose an approach that combines character-granularity and word-granularity features to perform sentence intention matching, and we use soft alignment attention to enhance the local information of sentences at the corresponding levels. The proposed method captures sentence feature information from multiple perspectives, as well as correlation information between different levels of sentences. Evaluated on the BQ and LCQMC datasets, our model achieves strong results, with performance better than or comparable to BERT-based models.

2017

QLUT at SemEval-2017 Task 1: Semantic Textual Similarity Based on Word Embeddings
Fanqing Meng | Wenpeng Lu | Yuteng Zhang | Jinyong Cheng | Yuehan Du | Shuwang Han
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This paper reports the details of our submissions to Task 1 of SemEval-2017. The task assesses the semantic textual similarity of two sentences or texts. We submit three unsupervised systems based on word embeddings. The runs differ in the preprocessing applied to the evaluation data. The best Pearson correlation achieved by these systems is 0.6887. Unsurprisingly, the results of our runs demonstrate that data preprocessing, such as tokenization, lemmatization, extraction of content words, and stop-word removal, is helpful and plays a significant role in improving model performance.

QLUT at SemEval-2017 Task 2: Word Similarity Based on Word Embedding and Knowledge Base
Fanqing Meng | Wenpeng Lu | Yuteng Zhang | Ping Jian | Shumin Shi | Heyan Huang
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

This paper describes our system submissions to Task 2 of SemEval-2017. We take part in Subtask 1 of this task, which is an English monolingual subtask. The task is designed to evaluate the semantic word similarity of two linguistic items. Runs are assessed by standard Pearson and Spearman correlation against the official gold standard set. The best performance of our runs is 0.781 (Final). Our runs mainly make use of word embeddings and a knowledge-based method. The results demonstrate that the combined method is effective for computing word similarity, while the word embeddings and the knowledge-based technique, taken individually, each need further improvement in their details.

2016

BIT at SemEval-2016 Task 1: Sentence Similarity Based on Alignments and Vector with the Weight of Information Content
Hao Wu | Heyan Huang | Wenpeng Lu
Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016)