Xuesong Lu


2022

pdf
Multi-task Learning for Paraphrase Generation With Keyword and Part-of-Speech Reconstruction
Xuhang Xie | Xuesong Lu | Bei Chen
Findings of the Association for Computational Linguistics: ACL 2022

Paraphrase generation using deep learning has been a research hotspot of natural language processing in the past few years. While previous studies tackle the problem from different aspects, the essence of paraphrase generation is to retain the key semantics of the source sentence and rewrite the rest of the content. Inspired by this observation, we propose a novel two-stage model, PGKPR, for paraphrase generation with keyword and part-of-speech reconstruction. The rationale is to capture simultaneously the possible keywords of a source sentence and the relations between them to facilitate the rewriting. In the first stage, we identify the possible keywords using a prediction attribution technique, where the words obtaining higher attribution scores are more likely to be the keywords. In the second stage, we train a transformer-based model via multi-task learning for paraphrase generation. The novel learning task is the reconstruction of the keywords and part-of-speech tags, respectively, from a perturbed sequence of the source sentence. The learned encodings are then decoded to generate the paraphrase. We conduct the experiments on two commonly-used datasets, and demonstrate the superior performance of PGKPR over comparative models on multiple evaluation metrics.

pdf
CAT-probing: A Metric-based Approach to Interpret How Pre-trained Models for Programming Language Attend Code Structure
Nuo Chen | Qiushi Sun | Renyu Zhu | Xiang Li | Xuesong Lu | Ming Gao
Findings of the Association for Computational Linguistics: EMNLP 2022

Code pre-trained models (CodePTMs) have recently demonstrated significant success in code intelligence. To interpret these models, some probing methods have been applied. However, these methods fail to consider the inherent characteristics of codes. In this paper, to address the problem, we propose a novel probing method CAT-probing to quantitatively interpret how CodePTMs attend code structure. We first denoise the input code sequences based on the token types pre-defined by the compilers to filter those tokens whose attention scores are too small. After that, we define a new metric CAT-score to measure the commonality between the token-level attention scores generated in CodePTMs and the pair-wise distances between corresponding AST nodes. The higher the CAT-score, the stronger the ability of CodePTMs to capture code structure. We conduct extensive experiments to integrate CAT-probing with representative CodePTMs for different programming languages. Experimental results show the effectiveness of CAT-probing in CodePTM interpretation. Our codes and data are publicly available at https://github.com/nchen909/CodeAttention.