Open-ended Long Text Generation via Masked Language Modeling

Xiaobo Liang, Zecheng Tang, Juntao Li, Min Zhang


Abstract
Pre-trained autoregressive (AR) language models such as BART and GPTs have dominated Open-ended Long Text Generation (Open-LTG). However, the AR nature decreases inference efficiency as the generation length grows, which hinders their application in Open-LTG. To improve inference efficiency, we alternatively explore the potential of pre-trained masked language models (MLMs) along with a representative iterative non-autoregressive (NAR) decoding strategy for Open-LTG. Our preliminary study shows that pre-trained MLMs can only generate short texts and collapse when modeling long texts. To enhance the long text generation capability of MLMs, we introduce two simple yet effective strategies for the iterative NAR model: dynamic sliding window attention (DSWA) and linear temperature decay (LTD). They alleviate long-distance collapse problems and enable longer text generation with a flexible trade-off between performance and inference speedup. Experiments on storytelling and multi-paragraph opinionated article writing tasks show that pre-trained MLMs can achieve more than 3× to 13× speedup with better performance than strong AR models.
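The abstract names two decoding-time strategies, dynamic sliding window attention (DSWA) and linear temperature decay (LTD), without showing their implementation. The sketch below is a rough, hypothetical illustration of what these two ideas could look like for an iterative NAR refinement loop; the function names, window size, and decay endpoints are assumptions for illustration and are not the authors' released code.

    # Hypothetical sketch of LTD and a sliding-window attention mask (not the paper's code).
    import torch

    def linear_temperature(step: int, total_steps: int,
                           t_start: float = 1.0, t_end: float = 0.1) -> float:
        """Linearly decay the sampling temperature from t_start to t_end
        across iterative NAR refinement steps (illustrative endpoints)."""
        frac = step / max(total_steps - 1, 1)
        return t_start + (t_end - t_start) * frac

    def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
        """Boolean attention mask where each position may attend only to tokens
        within +/- `window` positions, restricting long-distance attention."""
        idx = torch.arange(seq_len)
        dist = (idx[None, :] - idx[:, None]).abs()
        return dist <= window  # True = allowed to attend

    # Example: a temperature schedule for 10 refinement iterations
    # and a 128-token window over a 512-token sequence (both values illustrative).
    temps = [linear_temperature(s, total_steps=10) for s in range(10)]
    mask = sliding_window_mask(seq_len=512, window=128)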
Anthology ID:
2023.acl-long.13
Volume:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2023
Address:
Toronto, Canada
Editors:
Anna Rogers, Jordan Boyd-Graber, Naoaki Okazaki
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
223–241
URL:
https://aclanthology.org/2023.acl-long.13
DOI:
10.18653/v1/2023.acl-long.13
Bibkey:
Cite (ACL):
Xiaobo Liang, Zecheng Tang, Juntao Li, and Min Zhang. 2023. Open-ended Long Text Generation via Masked Language Modeling. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 223–241, Toronto, Canada. Association for Computational Linguistics.
Cite (Informal):
Open-ended Long Text Generation via Masked Language Modeling (Liang et al., ACL 2023)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/2023.acl-long.13.pdf
Video:
https://preview.aclanthology.org/naacl-24-ws-corrections/2023.acl-long.13.mp4