Robustness Analysis of Grover for Machine-Generated News Detection

Rinaldo Gagiano, Maria Myung-Hee Kim, Xiuzhen Zhang, Jennifer Biggs


Abstract
Advancements in Natural Language Generation have raised concerns about their potential misuse for deep fake news. Grover is a model for both the generation and detection of neural fake news. While its performance in automatically discriminating neural fake news surpasses that of GPT-2 and BERT, Grover could face a variety of adversarial attacks designed to deceive detection. In this work, we present an investigation of Grover's susceptibility to adversarial attacks such as character-level and word-level perturbations. The experimental results show that even a single character alteration can cause Grover to fail, affecting up to 97% of target articles when attack attempts are unlimited, exposing a lack of robustness. We further analyse these misclassified cases to highlight affected words, identify a vulnerability within Grover's encoder, and perform a novel visualisation of cumulative classification scores to assist in interpreting model behaviour.
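The character-level attack described in the abstract can be pictured as a simple search loop: alter one character at a time and query the detector until its decision flips or an attempt budget is exhausted. The sketch below is illustrative only, assuming a hypothetical black-box `detect` callable that returns the probability an article is machine-written; it does not reflect Grover's actual API or the authors' exact procedure.

```python
import random
from typing import Callable, Optional

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def perturb_char(word: str, rng: random.Random) -> str:
    """Apply one random character-level edit (swap, delete, or insert)."""
    if len(word) < 2:
        return word + rng.choice(ALPHABET)
    i = rng.randrange(len(word) - 1)
    op = rng.choice(["swap", "delete", "insert"])
    if op == "swap":      # transpose two adjacent characters
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if op == "delete":    # drop one character
        return word[:i] + word[i + 1:]
    return word[:i] + rng.choice(ALPHABET) + word[i:]  # insert one character

def attack(article: str,
           detect: Callable[[str], float],
           budget: int = 100,
           seed: int = 0) -> Optional[str]:
    """Perturb one character per attempt until `detect` scores the article
    below 0.5 (i.e. judges it human-written), or the budget is exhausted."""
    rng = random.Random(seed)
    tokens = article.split()
    for _ in range(budget):
        candidate = tokens[:]
        j = rng.randrange(len(candidate))
        candidate[j] = perturb_char(candidate[j], rng)
        perturbed = " ".join(candidate)
        if detect(perturbed) < 0.5:
            return perturbed  # detector fooled by a single-character edit
    return None
```

As a usage example, `attack(article, detect=my_detector_wrapper)` (where `my_detector_wrapper` is whatever scoring function is available) returns a perturbed article that evades the detector, or None if the budget runs out.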
Anthology ID: 2021.alta-1.12
Volume: Proceedings of the 19th Annual Workshop of the Australasian Language Technology Association
Month: December
Year: 2021
Address: Online
Editors: Afshin Rahimi, William Lane, Guido Zuccon
Venue: ALTA
Publisher: Australasian Language Technology Association
Pages: 119–127
URL: https://aclanthology.org/2021.alta-1.12
Cite (ACL): Rinaldo Gagiano, Maria Myung-Hee Kim, Xiuzhen Zhang, and Jennifer Biggs. 2021. Robustness Analysis of Grover for Machine-Generated News Detection. In Proceedings of the 19th Annual Workshop of the Australasian Language Technology Association, pages 119–127, Online. Australasian Language Technology Association.
Cite (Informal): Robustness Analysis of Grover for Machine-Generated News Detection (Gagiano et al., ALTA 2021)
PDF: https://preview.aclanthology.org/add_acl24_videos/2021.alta-1.12.pdf
Data: RealNews