@inproceedings{liang-etal-2022-janus,
title = "{JANUS}: Joint Autoregressive and Non-autoregressive Training with Auxiliary Loss for Sequence Generation",
author = "Liang, Xiaobo and
Wu, Lijun and
Li, Juntao and
Zhang, Min",
editor = "Goldberg, Yoav and
Kozareva, Zornitsa and
Zhang, Yue",
booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing",
month = dec,
year = "2022",
address = "Abu Dhabi, United Arab Emirates",
publisher = "Association for Computational Linguistics",
url = "https://preview.aclanthology.org/jlcl-multiple-ingestion/2022.emnlp-main.550/",
doi = "10.18653/v1/2022.emnlp-main.550",
pages = "8050--8060",
abstract = "Transformer-based autoregressive and non-autoregressive models have played an essential role in sequence generation tasks. The autoregressive model can obtain excellent performance, while the non-autoregressive model brings fast decoding speed for inference. In this paper, we propose \textbf{JANUS}, a \textbf{J}oint \textbf{A}utoregressive and \textbf{N}on-autoregressive training method using a\textbf{U}xiliary los\textbf{S} to enhance the model performance in both AR and NAR manner simultaneously and effectively alleviate the problem of distribution discrepancy.Further, we pre-train BART with JANUS on a large corpus with minimal cost (16 GPU days) and make the BART-JANUS capable of non-autoregressive generation, demonstrating that our approach can transfer the AR knowledge to NAR. Empirically, we show our approach and BART-JANUS can achieve significant improvement on multiple generation tasks, including machine translation and GLGE benchmarks. Our code is available at Github."
}
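The following is a minimal PyTorch-style sketch of the general idea the abstract describes: training a single model with both an autoregressive (AR) loss and a non-autoregressive (NAR) loss, plus an auxiliary term that reduces the discrepancy between the two output distributions. The function name `janus_style_loss`, the `model(src, tgt_in, causal=...)` interface, the choice of a KL term as the auxiliary loss, and the weights `alpha` and `beta` are all assumptions for illustration; the paper's actual formulation is not given in the abstract.

```python
# Illustrative sketch only -- not the authors' implementation of JANUS.
import torch
import torch.nn.functional as F

def janus_style_loss(model, src, tgt, pad_id, mask_id, alpha=0.5, beta=0.1):
    """Combine AR, NAR, and auxiliary consistency losses for one batch.

    `model(src, tgt_in, causal=...)` is assumed to return per-token logits:
    causal=True runs teacher forcing with a causal mask (AR pass), while
    causal=False decodes all positions in parallel from a masked input (NAR pass).
    """
    tgt_in, tgt_out = tgt[:, :-1], tgt[:, 1:]

    # AR pass: standard teacher-forced cross-entropy.
    ar_logits = model(src, tgt_in, causal=True)
    ar_loss = F.cross_entropy(
        ar_logits.reshape(-1, ar_logits.size(-1)),
        tgt_out.reshape(-1),
        ignore_index=pad_id,
    )

    # NAR pass: predict every target token in parallel from a fully masked input.
    masked_in = torch.full_like(tgt_in, mask_id)
    nar_logits = model(src, masked_in, causal=False)
    nar_loss = F.cross_entropy(
        nar_logits.reshape(-1, nar_logits.size(-1)),
        tgt_out.reshape(-1),
        ignore_index=pad_id,
    )

    # Auxiliary term (an assumed choice): KL between the NAR and AR output
    # distributions, one simple way to narrow the distribution discrepancy
    # the abstract mentions.
    aux_loss = F.kl_div(
        F.log_softmax(nar_logits, dim=-1),
        F.softmax(ar_logits.detach(), dim=-1),
        reduction="batchmean",
    )

    return ar_loss + alpha * nar_loss + beta * aux_loss
```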