Multichannel Generative Language Model: Learning All Possible Factorizations Within and Across Channels

Harris Chan; Jamie Kiros; William Chan

doi:10.18653/v1/2020.findings-emnlp.376

Abstract

A channel corresponds to a viewpoint or transformation of an underlying meaning. For example, a pair of parallel sentences in English and French expresses the same underlying meaning through two separate channels, one per language. In this work, we present the Multichannel Generative Language Model (MGLM), a generative model of the joint distribution over channels. MGLM marginalizes over all possible factorizations within and across all channels. This enables flexible inference, including unconditional generation, conditional generation (where one channel is observed and the other channels are generated), and partially observed generation (where incomplete observations are spread across all the channels). We experiment with the Multi30K dataset, which contains English, French, Czech, and German. We run experiments with unconditional, conditional, and partially conditional generation, and provide qualitative samples drawn unconditionally from the generative joint distribution. We also quantitatively analyze the quality-diversity trade-offs and find that MGLM outperforms traditional bilingual discriminative models.

Anthology ID: 2020.findings-emnlp.376
Volume: Findings of the Association for Computational Linguistics: EMNLP 2020
Month: November
Year: 2020
Address: Online
Editors: Trevor Cohn, Yulan He, Yang Liu
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 4208–4220
URL: https://aclanthology.org/2020.findings-emnlp.376
DOI: 10.18653/v1/2020.findings-emnlp.376
Cite (ACL): Harris Chan, Jamie Kiros, and William Chan. 2020. Multichannel Generative Language Model: Learning All Possible Factorizations Within and Across Channels. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 4208–4220, Online. Association for Computational Linguistics.
Cite (Informal): Multichannel Generative Language Model: Learning All Possible Factorizations Within and Across Channels (Chan et al., Findings 2020)
PDF: https://preview.aclanthology.org/nschneid-patch-5/2020.findings-emnlp.376.pdf
Video: https://slideslive.com/38940114
Data: Multi30K
