GEM: Generative Enhanced Model for adversarial attacks

Piotr Niewinski, Maria Pszona, Maria Janicka


Abstract
We present the Generative Enhanced Model (GEM), which we used to create the samples that were awarded first prize in the FEVER 2.0 Breakers Task. GEM is an extended language model built on the GPT-2 architecture. Adding a novel target vocabulary input to the existing context input enables controlled text generation. The training procedure produced a model that inherits the knowledge of pretrained GPT-2 and can therefore generate natural-sounding English sentences in the task domain with some additional control. As a result, GEM generated malicious claims that mixed facts from various articles, making their truthfulness difficult to classify.
Anthology ID:
D19-6604
Volume:
Proceedings of the Second Workshop on Fact Extraction and VERification (FEVER)
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
James Thorne, Andreas Vlachos, Oana Cocarascu, Christos Christodoulopoulos, Arpit Mittal
Venue:
WS
Publisher:
Association for Computational Linguistics
Pages:
20–26
URL:
https://preview.aclanthology.org/add-orcids-2023-acl/D19-6604/
DOI:
10.18653/v1/D19-6604
Cite (ACL):
Piotr Niewinski, Maria Pszona, and Maria Janicka. 2019. GEM: Generative Enhanced Model for adversarial attacks. In Proceedings of the Second Workshop on Fact Extraction and VERification (FEVER), pages 20–26, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
GEM: Generative Enhanced Model for adversarial attacks (Niewinski et al., 2019)
PDF:
https://preview.aclanthology.org/add-orcids-2023-acl/D19-6604.pdf
Attachment:
D19-6604.Attachment.pdf