Rethinking Positional Encoding in Tree Transformer for Code Representation

Han Peng, Ge Li, Yunfei Zhao, Zhi Jin


Abstract
Transformers are now widely used in code representation, and several recent works further develop tree Transformers to capture the syntactic structure of source code. Specifically, novel tree positional encodings have been proposed to incorporate inductive bias into the Transformer. In this work, we propose a novel tree Transformer that encodes node positions based on our new description method for tree structures. Technically, both the local and global soft biases shown in previous works are introduced as positional encodings in our Transformer model. Our model outperforms strong baselines on code summarization and completion tasks across two languages, demonstrating its effectiveness. In addition, extensive experiments and an ablation study show that combining the local and global paradigms is still helpful in improving model performance. We release our code at https://github.com/AwdHanPeng/TreeTransformer.
Anthology ID:
2022.emnlp-main.210
Volume:
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates
Venue:
EMNLP
Publisher:
Association for Computational Linguistics
Pages:
3204–3214
URL:
https://aclanthology.org/2022.emnlp-main.210
Cite (ACL):
Han Peng, Ge Li, Yunfei Zhao, and Zhi Jin. 2022. Rethinking Positional Encoding in Tree Transformer for Code Representation. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 3204–3214, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics.
Cite (Informal):
Rethinking Positional Encoding in Tree Transformer for Code Representation (Peng et al., EMNLP 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.emnlp-main.210.pdf