Improving Natural Language Understanding by Reverse Mapping Bytepair Encoding
Chaodong Tong, Huailiang Peng, Qiong Dai, Lei Jiang, Jianghua Huang
Abstract
We propose a method called reverse mapping bytepair encoding, which maps named-entity information and other word-level linguistic features back to subwords during bytepair encoding (BPE). We apply this method to the Generative Pre-trained Transformer (OpenAI GPT) by adding a weighted linear layer after the embedding layer. We also propose a new model architecture, the multi-channel separate transformer, which enables training without parameter sharing. Evaluation on the Stories Cloze, RTE, SciTail and SST-2 datasets demonstrates the effectiveness of our approach.
- Anthology ID:
- K19-1016
- Volume:
- Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL)
- Month:
- November
- Year:
- 2019
- Address:
- Hong Kong, China
- Venue:
- CoNLL
- SIG:
- SIGNLL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 163–173
- URL:
- https://aclanthology.org/K19-1016
- DOI:
- 10.18653/v1/K19-1016
- Cite (ACL):
- Chaodong Tong, Huailiang Peng, Qiong Dai, Lei Jiang, and Jianghua Huang. 2019. Improving Natural Language Understanding by Reverse Mapping Bytepair Encoding. In Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL), pages 163–173, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal):
- Improving Natural Language Understanding by Reverse Mapping Bytepair Encoding (Tong et al., CoNLL 2019)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/K19-1016.pdf
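The reverse mapping idea described in the abstract, carrying word-level features such as named-entity tags down to the subwords produced by segmentation, can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_segment` is a hypothetical stand-in for a real BPE encoder, `reverse_map` is an illustrative name, and copying each word's tag unchanged onto every one of its subwords is an assumption, since the abstract does not specify the exact scheme.

```python
from typing import Callable, List, Sequence, Tuple


def toy_segment(word: str, max_len: int = 3) -> List[str]:
    """Hypothetical stand-in for a BPE encoder.

    Splits a word into consecutive chunks of at most max_len characters.
    A real BPE encoder would apply learned merge rules instead.
    """
    return [word[i:i + max_len] for i in range(0, len(word), max_len)]


def reverse_map(
    words: Sequence[str],
    tags: Sequence[str],
    segment: Callable[[str], List[str]] = toy_segment,
) -> Tuple[List[str], List[str]]:
    """Segment each word and copy its word-level tag onto every subword.

    Assumption: each subword inherits its parent word's tag unchanged.
    """
    if len(words) != len(tags):
        raise ValueError("words and tags must have the same length")
    subwords: List[str] = []
    subword_tags: List[str] = []
    for word, tag in zip(words, tags):
        pieces = segment(word)
        subwords.extend(pieces)
        subword_tags.extend([tag] * len(pieces))
    return subwords, subword_tags


# Illustrative input; the tags are made up for this example.
words = ["Hong", "Kong", "hosted", "CoNLL"]
tags = ["LOC", "LOC", "O", "EVENT"]
subwords, subword_tags = reverse_map(words, tags)
for piece, tag in zip(subwords, subword_tags):
    print(piece, tag)
```

The two returned lists are aligned one-to-one, so a feature embedding looked up from `subword_tags` could be combined with the subword embeddings position by position. How that combination is weighted is left open here.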