Abstract
Linguistic steganography studies how to hide secret messages in natural language cover texts. Traditional methods aim to transform a secret message into an innocent text via lexical substitution or syntactical modification. Recently, advances in neural language models (LMs) enable us to directly generate cover text conditioned on the secret message. In this study, we present a new linguistic steganography method which encodes secret messages using self-adjusting arithmetic coding based on a neural language model. We formally analyze the statistical imperceptibility of this method and empirically show it outperforms the previous state-of-the-art methods on four datasets by 15.3% and 38.9% in terms of bits/word and KL metrics, respectively. Finally, human evaluations show that 51% of generated cover texts can indeed fool eavesdroppers.

- Anthology ID:
- 2020.emnlp-main.22
- Volume:
- Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
- Month:
- November
- Year:
- 2020
- Address:
- Online
- Editors:
- Bonnie Webber, Trevor Cohn, Yulan He, Yang Liu
- Venue:
- EMNLP
- Publisher:
- Association for Computational Linguistics
- Pages:
- 303–313
- URL:
- https://aclanthology.org/2020.emnlp-main.22
- DOI:
- 10.18653/v1/2020.emnlp-main.22
- Cite (ACL):
- Jiaming Shen, Heng Ji, and Jiawei Han. 2020. Near-imperceptible Neural Linguistic Steganography via Self-Adjusting Arithmetic Coding. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 303–313, Online. Association for Computational Linguistics.
- Cite (Informal):
- Near-imperceptible Neural Linguistic Steganography via Self-Adjusting Arithmetic Coding (Shen et al., EMNLP 2020)
- PDF:
- https://preview.aclanthology.org/add_acl24_videos/2020.emnlp-main.22.pdf
- Code:
- mickeystroller/StegaText