METADATA
--------
Model Name: BERT
Attack Target: stop_token
Gradient Model File: ../sst_attack_models/experiment_29/model_iter400_epoch0.th
Predictive Model File: ../sst_regularized_models/anonymn/BERT_low_grad_high_acc_SST_ep0_7800.th
Baseline Model File: ../sst_baseline_models/anonymn/BERT_matched_accfixed3_SA.th
Cuda: True

Gradient Combined
------------------------
mean_reciprocal_rank: 0.984
hit_rate_1: 0.978
mean_grad_attribution: 0.924

Gradient Simple Combined
------------------------
mean_reciprocal_rank: 0.980
hit_rate_1: 0.972
mean_grad_attribution: 0.781

Gradient Regularized
---------------------------
mean_reciprocal_rank: 0.325
hit_rate_1: 0.114
mean_grad_attribution: 0.227

Gradient Baseline
------------------------
mean_reciprocal_rank: 0.351
hit_rate_1: 0.139
mean_grad_attribution: 0.242

Gradient Evil Twin
-------------------------
mean_reciprocal_rank: 0.990
hit_rate_1: 0.989
mean_grad_attribution: 0.977

###################################################

SmoothGrad Combined
------------------------
mean_reciprocal_rank: 0.977
hit_rate_1: 0.966
mean_grad_attribution: 0.901

SmoothGrad Simple Combined
--------------------------
mean_reciprocal_rank: 0.970
hit_rate_1: 0.955
mean_grad_attribution: 0.727

SmoothGrad Regularized
---------------------------
mean_reciprocal_rank: 0.322
hit_rate_1: 0.119
mean_grad_attribution: 0.219

SmoothGrad Baseline
------------------------
mean_reciprocal_rank: 0.340
hit_rate_1: 0.125
mean_grad_attribution: 0.232

SmoothGrad Evil Twin
-------------------------
mean_reciprocal_rank: 0.990
hit_rate_1: 0.987
mean_grad_attribution: 0.977

###################################################

InteGrad Combined
----------------------------
mean_reciprocal_rank: 0.650
hit_rate_1: 0.467
mean_grad_attribution: 0.440

InteGrad Simple Combined
------------------------
mean_reciprocal_rank: 0.356
hit_rate_1: 0.100
mean_grad_attribution: 0.218

InteGrad Regularized
-------------------------------
mean_reciprocal_rank: 0.350
hit_rate_1: 0.133
mean_grad_attribution: 0.210

InteGrad Baseline
---------------------------
mean_reciprocal_rank: 0.353
hit_rate_1: 0.100
mean_grad_attribution: 0.214

InteGrad Evil Twin
-----------------------------
mean_reciprocal_rank: 0.989
hit_rate_1: 0.987
mean_grad_attribution: 0.934

MODEL ACCURACIES
------------------
Combined Model Acc: 0.927
Simple Combined Model Acc: 0.922
Regularized Model Acc: 0.928
Baseline Model Acc: 0.927
Evil Twin Model Acc: 0.569

