METADATA
--------
Model Name: BERT
Attack Target: first_token
Gradient Model File: ../sst_attack_models/experiment_30/model_iter200_epoch1.th
Predictive Model File: ../sst_regularized_models/anonymn/BERT_low_grad_high_acc_SST_ep0_7800.th
Baseline Model File: ../sst_baseline_models/anonymn/BERT_matched_accfixed3_SA.th
Cuda: True

Gradient Combined
-----------------
mean_reciprocal_rank: 0.998
hit_rate_1: 0.997
mean_grad_attribution: 0.911

Gradient Regularized
--------------------
mean_reciprocal_rank: 0.194
hit_rate_1: 0.076
mean_grad_attribution: 0.054

Gradient Baseline
-----------------
mean_reciprocal_rank: 0.219
hit_rate_1: 0.083
mean_grad_attribution: 0.062

Gradient Evil Twin
------------------
mean_reciprocal_rank: 1.000
hit_rate_1: 1.000
mean_grad_attribution: 0.993

Gradient Simple Combined
------------------------
mean_reciprocal_rank: 0.998
hit_rate_1: 0.995
mean_grad_attribution: 0.678

##################################################

SmoothGrad Combined
-------------------
mean_reciprocal_rank: 0.994
hit_rate_1: 0.989
mean_grad_attribution: 0.870

SmoothGrad Regularized
----------------------
mean_reciprocal_rank: 0.186
hit_rate_1: 0.062
mean_grad_attribution: 0.053

SmoothGrad Baseline
-------------------
mean_reciprocal_rank: 0.215
hit_rate_1: 0.079
mean_grad_attribution: 0.060

SmoothGrad Evil Twin
--------------------
mean_reciprocal_rank: 1.000
hit_rate_1: 1.000
mean_grad_attribution: 0.993

SmoothGrad Simple Combined
--------------------------
mean_reciprocal_rank: 0.991
hit_rate_1: 0.983
mean_grad_attribution: 0.589

##################################################

InteGrad Combined
-----------------
mean_reciprocal_rank: 0.618
hit_rate_1: 0.478
mean_grad_attribution: 0.298

InteGrad Regularized
--------------------
mean_reciprocal_rank: 0.132
hit_rate_1: 0.028
mean_grad_attribution: 0.033

InteGrad Baseline
-----------------
mean_reciprocal_rank: 0.144
hit_rate_1: 0.022
mean_grad_attribution: 0.038

InteGrad Evil Twin
------------------
mean_reciprocal_rank: 1.000
hit_rate_1: 1.000
mean_grad_attribution: 0.982

InteGrad Simple Combined
------------------------
mean_reciprocal_rank: 0.165
hit_rate_1: 0.028
mean_grad_attribution: 0.042

##################################################

Model Accuracies
----------------
Combined Model Acc: 0.925
Regularized Model Acc: 0.928
Baseline Model Acc: 0.927
Evil Twin Model Acc: 0.485
Simple Combined Model Acc: 0.928
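
Metric Sketch (illustrative)
----------------------------
The three per-method figures reported above could be computed from per-token attribution scores roughly as follows. This is a minimal sketch, not the evaluation script that produced this log. It assumes that tokens are ranked by absolute attribution magnitude, that the attack target is token index 0 (Attack Target: first_token), and that mean_grad_attribution is the target token's share of the total absolute attribution. The function names rank_of_target and summarize are hypothetical.

```python
from typing import Dict, Sequence


def rank_of_target(scores: Sequence[float], target: int = 0) -> int:
    """Return the 1-based rank of the target token by absolute attribution.

    Assumption: rank 1 means the largest magnitude; ties favor the target.
    """
    mags = [abs(s) for s in scores]
    return 1 + sum(1 for m in mags if m > mags[target])


def summarize(batch: Sequence[Sequence[float]], target: int = 0) -> Dict[str, float]:
    """Average the three metrics over a batch of per-token attribution vectors."""
    reciprocal_rank = 0.0
    hits = 0.0
    attribution_share = 0.0
    for scores in batch:
        rank = rank_of_target(scores, target)
        reciprocal_rank += 1.0 / rank
        hits += 1.0 if rank == 1 else 0.0
        total = sum(abs(s) for s in scores)
        # Assumption: attribution is L1-normalized per example.
        attribution_share += abs(scores[target]) / total if total else 0.0
    n = len(batch)
    return {
        "mean_reciprocal_rank": reciprocal_rank / n,
        "hit_rate_1": hits / n,
        "mean_grad_attribution": attribution_share / n,
    }


# Two made-up examples: the first token dominates in one and not in the other.
example = [[0.8, 0.1, 0.1], [0.2, 0.6, 0.2]]
result = summarize(example)
print({k: round(v, 3) for k, v in result.items()})
```

Under these assumptions, a model whose attributions always peak on the first token scores 1.000 on the first two metrics, which matches the pattern reported for the Evil Twin rows.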
