A Novel Estimator of Mutual Information for Learning to Disentangle Textual Representations

Pierre Colombo, Pablo Piantanida, Chloé Clavel


Abstract
Learning disentangled representations of textual data is essential for many natural language tasks such as fair classification, style transfer and sentence generation. The dominant existing approaches for text data either train an adversary (discriminator) so that attribute values are difficult to infer from the latent code, or minimise variational bounds on the mutual information between the latent code and the attribute value. However, the available methods cannot provide fine-grained control over the degree (or force) of disentanglement. Adversarial methods are remarkably simple, but although the adversary appears to perform perfectly well during training, a fair amount of information about the undesired attribute still remains once training is complete. This paper introduces a novel variational upper bound on the mutual information between an attribute and the latent code of an encoder. Our bound aims at controlling the approximation error via the Rényi divergence, leading to both better disentangled representations and, in particular, more precise control of the desired degree of disentanglement than state-of-the-art methods proposed for textual data. Furthermore, it does not suffer from the degeneracy of other losses in multi-class scenarios. We show the superiority of this method on fair classification and on textual style transfer tasks. Additionally, we provide new insights illustrating various trade-offs in style transfer between learning disentangled representations and the quality of the generated sentence.
Anthology ID:
2021.acl-long.511
Volume:
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)
Month:
August
Year:
2021
Address:
Online
Venues:
ACL | IJCNLP
Publisher:
Association for Computational Linguistics
Pages:
6539–6550
URL:
https://aclanthology.org/2021.acl-long.511
DOI:
10.18653/v1/2021.acl-long.511
Cite (ACL):
Pierre Colombo, Pablo Piantanida, and Chloé Clavel. 2021. A Novel Estimator of Mutual Information for Learning to Disentangle Textual Representations. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 6539–6550, Online. Association for Computational Linguistics.
Cite (Informal):
A Novel Estimator of Mutual Information for Learning to Disentangle Textual Representations (Colombo et al., ACL-IJCNLP 2021)
PDF:
https://preview.aclanthology.org/author-url/2021.acl-long.511.pdf
Optional supplementary material:
 2021.acl-long.511.OptionalSupplementaryMaterial.zip
Video:
 https://preview.aclanthology.org/author-url/2021.acl-long.511.mp4