Tokenizer: bert-base-cased Model: bert-base-cased
	Train size: 80 Test size: 20


		-------------RUN 1-----------
			------------EPOCH 1---------------
Loss:  tensor(1655.1002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2454.9053, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3506.6826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1998.1255, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2007.1885, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2080.4868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1741.9049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3499.9111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2600.0493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3295.3247, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1429.2913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1703.3092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2475.2656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1616.5269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2184.1167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3219.2346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2159.8496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1957.5552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2756.3696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1564.7731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1205.0496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1342.4933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(801.7725, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1316.4739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1419.2170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2142.8037, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2026.5255, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1665.0905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2268.8682, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1646.1093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1964.3457, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1174.1498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2291.9478, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1638.0674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1376.5437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1643.2905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1166.0757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2029.6433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1806.0315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3756.1365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2843.2339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1548.3933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1712.3990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1177.4656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1761.3977, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2423.4102, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2594.0215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1239.9241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1631.2284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2629.6404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1890.1653, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(651.6155, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1264.6250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2410.7219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2297.6067, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2244.1587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1061.6178, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.17014925373134326
Sentence level Krippendorff's alpha for Premises:  0.22686567164179106
Additional attributes: 
	Total Sentences: 670
	Prediction sentences having claims: 310
	Prediction sentences having premises: 151
	Reference sentences having claims: 248
	Reference sentences having premises: 292


	Prediction sentences having both claim and premise: 72
	Prediction sentences having neither claim nor premise: 281
	Reference sentences having both claim and premise: 44
	Reference sentences having neither claim nor premise: 174


	Sentences where reference and prediction agree on claim (both present or both absent): 392
	Sentences where reference and prediction disagree on claim: 278
	Sentences where reference and prediction agree on premise (both present or both absent): 411
	Sentences where reference and prediction disagree on premise: 259
				 Metric computations: None
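
Note on the reported alpha: in every epoch of this log, the printed alpha equals (agree - disagree) / total, taken from the two pairs of counts that close each "Additional attributes" block (for epoch 1: 392/278 for claims, 411/259 for premises, out of 670 sentences). This is the value that 1 - D_o/D_e yields when the expected disagreement D_e is fixed at 0.5. That is an observation about the logged numbers, not a statement about how the training script computes them. The sketch below reproduces the epoch 1 values; the function name `agreement_score` is hypothetical and does not come from the original code.

```python
import math


def agreement_score(agree: int, disagree: int) -> float:
    """Return (agree - disagree) / total, i.e. 1 - 2 * observed disagreement.

    Hypothetical helper: it mirrors the numbers printed in the log and is
    not taken from the script that produced them.
    """
    total = agree + disagree
    return (agree - disagree) / total


# Epoch 1 counts and logged alpha values, copied from the log above.
epoch1 = {
    "claims": (392, 278, 0.17014925373134326),
    "premises": (411, 259, 0.22686567164179106),
}

for label, (agree, disagree, logged) in epoch1.items():
    score = agreement_score(agree, disagree)
    matches = math.isclose(score, logged, rel_tol=1e-9)
    print(f"{label}: recomputed={score:.6f} logged={logged:.6f} match={matches}")
```

Because a fixed D_e of 0.5 ignores the actual label frequencies (248 reference claims against 310 to 514 predicted ones), a frequency-aware Krippendorff's alpha computed from the same counts could differ from the values printed here.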
			------------EPOCH 2---------------
Loss:  tensor(1176.3922, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1744.4167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2572.0933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1286.1160, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1359.3909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1470.7097, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1280.5906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2818.5935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2151.5334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2792.0051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1092.2654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1371.1252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1898.7256, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1169.4678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1664.8772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2634.9316, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1678.7272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1561.0823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2054.9829, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1224.2029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(961.0416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1070.9863, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(682.0034, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1101.8184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1206.4587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1739.7446, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1505.0687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1309.3724, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1757.3740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1287.6583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1564.6472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1025.3831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2039.3142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1359.2491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1210.7708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1392.4784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(887.6101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1594.7020, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1525.2557, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3311.1528, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2470.2734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1442.8768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1499.4840, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1029.0323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1502.9250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2131.9868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2206.2964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1045.6046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1351.6978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2259.3354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1647.7009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(553.0942, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(993.0058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1993.4044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1907.3651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1933.8279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.3693, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.10149253731343288
Sentence level Krippendorff's alpha for Premises:  0.27462686567164174
Additional attributes: 
	Total Sentences: 670
	Prediction sentences having claims: 451
	Prediction sentences having premises: 121
	Reference sentences having claims: 248
	Reference sentences having premises: 292


	Prediction sentences having both claim and premise: 65
	Prediction sentences having neither claim nor premise: 163
	Reference sentences having both claim and premise: 44
	Reference sentences having neither claim nor premise: 174


	Sentences where reference and prediction agree on claim (both present or both absent): 369
	Sentences where reference and prediction disagree on claim: 301
	Sentences where reference and prediction agree on premise (both present or both absent): 427
	Sentences where reference and prediction disagree on premise: 243
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(993.1010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1439.6505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2229.6558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1056.4106, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1040.0684, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1257.1105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1131.8584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2381.5054, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1810.4380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2575.3364, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(891.5517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1148.1786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1561.3357, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(862.8256, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1334.7372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2297.2764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1362.8699, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1276.4436, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1597.5740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(972.9134, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(778.5396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(847.5916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(581.2968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(918.2301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1028.2542, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1427.7959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1156.7720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1054.6851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1465.3008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1057.5334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1288.2174, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.0737, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1847.2935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1230.0151, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1054.1937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1197.1973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(691.4083, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1319.0883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1302.1768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3022.7209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2107.6487, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1257.2590, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1017.7001, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(782.0966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1019.2126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1603.5471, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1842.1223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(880.7104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1178.4894, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1845.0411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1361.4463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(480.8391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(848.7131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1773.1998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1655.9712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1684.3887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(736.3011, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.12238805970149258
Sentence level Krippendorff's alpha for Premises:  0.26865671641791045
Additional attributes: 
	Total Sentences: 670
	Prediction sentences having claims: 420
	Prediction sentences having premises: 83
	Reference sentences having claims: 248
	Reference sentences having premises: 292


	Prediction sentences having both claim and premise: 40
	Prediction sentences having neither claim nor premise: 207
	Reference sentences having both claim and premise: 44
	Reference sentences having neither claim nor premise: 174


	Sentences where reference and prediction agree on claim (both present or both absent): 376
	Sentences where reference and prediction disagree on claim: 294
	Sentences where reference and prediction agree on premise (both present or both absent): 425
	Sentences where reference and prediction disagree on premise: 245
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(956.8978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1180.2548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1961.6206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(900.6905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(833.9031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1029.0734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(969.9626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1825.2102, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1464.3136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2242.3892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(788.2963, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(973.4303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1245.3228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(588.4768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1005.9662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1960.4634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1126.6201, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1055.5100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1225.8632, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(750.0251, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(579.1240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(620.6711, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(461.3553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(764.0870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(856.9090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1172.6116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(945.9438, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(921.5872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1236.9431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(862.2040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1012.5791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(771.7014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1694.1780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1104.3430, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(913.4446, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(975.5326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(552.9397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1108.3242, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1057.1317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2597.0210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1625.9795, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1039.5391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(691.9868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(619.1260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(661.0095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1212.5828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1526.7163, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(720.8105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1000.0321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1524.9518, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1142.5776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(396.0908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(689.8124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1597.0513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1375.2876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1455.2797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(601.6381, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.12238805970149258
Sentence level Krippendorff's alpha for Premises:  0.26567164179104474
Additional attributes: 
	Total Sentences: 670
	Prediction sentences having claims: 468
	Prediction sentences having premises: 80
	Reference sentences having claims: 248
	Reference sentences having premises: 292


	Prediction sentences having both claim and premise: 42
	Prediction sentences having neither claim nor premise: 164
	Reference sentences having both claim and premise: 44
	Reference sentences having neither claim nor premise: 174


	Sentences where reference and prediction agree on claim (both present or both absent): 376
	Sentences where reference and prediction disagree on claim: 294
	Sentences where reference and prediction agree on premise (both present or both absent): 424
	Sentences where reference and prediction disagree on premise: 246
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(802.4992, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(966.5148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1600.0493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(693.0569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(614.5932, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(821.6569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(827.3814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1389.5637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1157.2726, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1768.3794, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(647.0331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(814.9747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1046.8796, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(377.7454, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(712.0587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1728.8052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(949.4108, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.0237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1006.3950, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(578.6453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(390.3264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(414.3143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(334.1895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(606.3270, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(602.4681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(871.5112, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(703.9838, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(711.7234, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1086.3989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(732.8832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(818.4233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(607.8575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1345.5518, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(937.6040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(722.3635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(687.4702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(422.4859, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(957.3096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(778.9641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2097.6597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1191.9875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(787.7382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(362.5627, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(475.0546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(403.1260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1000.8726, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1513.5457, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(520.9697, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(770.5450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1050.6290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(819.3018, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(311.7020, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(558.5004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1339.1447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1074.1532, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1240.1467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(438.1806, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.11044776119402988
Sentence level Krippendorff's alpha for Premises:  0.35522388059701493
Additional attributes: 
	Total Sentences: 670
	Prediction sentences having claims: 510
	Prediction sentences having premises: 128
	Reference sentences having claims: 248
	Reference sentences having premises: 292


	Prediction sentences having both claim and premise: 66
	Prediction sentences having neither claim nor premise: 98
	Reference sentences having both claim and premise: 44
	Reference sentences having neither claim nor premise: 174


	Sentences where reference and prediction agree on claim (both present or both absent): 372
	Sentences where reference and prediction disagree on claim: 298
	Sentences where reference and prediction agree on premise (both present or both absent): 454
	Sentences where reference and prediction disagree on premise: 216
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(622.2133, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(805.5374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1195.1514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.8124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(440.0772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(634.2006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(749.0630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1070.4938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(824.7477, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1315.0410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(531.5398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(666.6782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(953.6561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(320.6271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(562.4948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1527.4561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(820.0500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(737.5325, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(728.0329, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(554.4486, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(269.9769, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(299.1111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(268.2509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(439.0581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(461.6832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(615.2681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(484.9510, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(515.4396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1006.5085, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(630.7842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(722.8859, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(431.2562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1181.8271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(866.6785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(645.9437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(576.4667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(287.9592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(835.2471, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(561.4021, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1749.4105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(853.1312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(643.0742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.7523, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(407.6831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.2857, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(705.6995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1030.8806, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(319.8699, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(641.2737, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(716.6982, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(686.4220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.7739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(486.2283, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1162.9014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(838.7645, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(997.1877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(376.9327, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.15522388059701497
Sentence level Krippendorff's alpha for Premises:  0.3731343283582089
Additional attributes: 
	Total Sentences: 670
	Prediction sentences having claims: 499
	Prediction sentences having premises: 104
	Reference sentences having claims: 248
	Reference sentences having premises: 292


	Prediction sentences having both claim and premise: 65
	Prediction sentences having neither claim nor premise: 132
	Reference sentences having both claim and premise: 44
	Reference sentences having neither claim nor premise: 174


	Sentences where reference and prediction agree on claim (both present or both absent): 387
	Sentences where reference and prediction disagree on claim: 283
	Sentences where reference and prediction agree on premise (both present or both absent): 460
	Sentences where reference and prediction disagree on premise: 210
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(528.3911, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(659.5050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(947.5591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(397.0452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(328.9807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(481.8935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(585.3307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(759.2611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(675.5554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1136.5688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(420.7144, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(553.8873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(755.2786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.5649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.7702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1232.1223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(617.8177, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(594.5175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(645.2228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(356.2880, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.9126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.2811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.0572, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(346.7903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(436.7051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(625.4634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(453.7972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(477.5442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(814.5164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(608.1470, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.1184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(379.6974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(873.9962, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(821.7556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(535.1810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(428.0580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.2519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(701.3969, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.7484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1616.8755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(803.7318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(677.7125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.5594, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(422.6215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(275.0379, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(891.2691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1245.3433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(388.7982, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(693.2243, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1079.2106, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(797.8047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.1914, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(381.6714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(911.1580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(838.9067, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(888.9404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.6537, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.12238805970149258
Sentence level Krippendorff's alpha for Premises:  0.30746268656716413
Additional attributes: 
	Total Sentences: 670
	Prediction sentences having claims: 514
	Prediction sentences having premises: 94
	Reference sentences having claims: 248
	Reference sentences having premises: 292


	Prediction sentences having both claim and premise: 57
	Prediction sentences having neither claim nor premise: 119
	Reference sentences having both claim and premise: 44
	Reference sentences having neither claim nor premise: 174


	Sentences having claim in both reference and prediction: 376
	Sentences having claim in only one of reference or prediction: 294
	Sentences having premise in both reference and prediction: 438
	Sentences having premise in only one of reference or prediction: 232
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(459.6221, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(625.2870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(847.6975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(359.5891, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(374.3298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(525.1996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(550.6663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(895.5039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(650.9227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1233.7372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(486.2818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(656.7320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(922.0594, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.3027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(448.2593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1508.0468, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(839.4323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(616.1096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(792.5093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.4702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.4840, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.6463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.4177, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(304.6127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(287.5405, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(413.1835, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.6949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(334.5353, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(650.5093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(457.1396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(526.9945, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(269.9729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(657.6544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(599.7681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(380.2280, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(373.7396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.9076, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(481.5015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(388.6935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1269.5898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(716.5795, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(542.3229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(239.5074, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.5042, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.4449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(726.9964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(921.1912, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.0352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(631.2422, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(702.1914, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(683.4210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.6095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(686.0039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1689.9813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1407.2698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1055.3831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(337.4721, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5074626865671642
Sentence level Krippendorff's alpha for Premises:  0.4477611940298507
Additional attributes: 
	Total Sentences: 670
	Prediction setences having claims: 253
	Prediction sentences having premises: 389
	Reference setences having claims: 248
	Reference sentences having premises: 292


	Prediction Sentence having both claim and premise: 85
	Prediction Sentence having neither claim nor premise: 113
	Reference Sentence having both claim and premise: 44
	Reference Sentence having neither claim nor premise: 174


	Sentences having claim in both reference and prediction: 505
	Sentences having claim in only one of reference or prediction: 165
	Sentences having premise in both reference and prediction: 485
	Sentences having premise in only one of reference or prediction: 185
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(576.0064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(505.1601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(662.8529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(429.4159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.5016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(381.7906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(335.2869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(492.0431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(346.3389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(648.0135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.0020, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(314.7217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(639.6495, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.1560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(295.5522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1138.1262, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(963.5496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(753.8212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(859.4951, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(526.0169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.4911, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(333.6301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.8513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(457.2739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(485.6472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1103.9573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(572.8604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(577.5344, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(694.7308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(631.5722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(645.7142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(425.7166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(774.3060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(680.5656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(359.1194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(467.7364, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.8356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(619.8466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(532.7040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1143.7968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(537.4550, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.8650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.2484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.5040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(207.3129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(434.4092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(630.9907, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.5118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(416.8544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(533.5129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(493.3394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.5322, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(440.5944, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(940.5953, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(916.6750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(826.2897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(536.8667, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3641791044776119
Sentence level Krippendorff's alpha for Premises:  0.22089552238805965
Additional attributes: 
	Total Sentences: 670
	Prediction setences having claims: 81
	Prediction sentences having premises: 545
	Reference setences having claims: 248
	Reference sentences having premises: 292


	Prediction Sentence having both claim and premise: 39
	Prediction Sentence having neither claim nor premise: 83
	Reference Sentence having both claim and premise: 44
	Reference Sentence having neither claim nor premise: 174


	Sentences having claim in both reference and prediction: 457
	Sentences having claim in only one of reference or prediction: 213
	Sentences having premise in both reference and prediction: 409
	Sentences having premise in only one of reference or prediction: 261
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(915.0449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(509.8382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1042.6241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(779.0767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(709.4946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(763.7140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(912.4287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2080.5605, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1231.2805, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1594.9838, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(428.6841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(813.9351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(988.4127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(330.3546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(469.7834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1011.2136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(556.3977, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(441.2281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(561.2131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.9114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.9951, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.3682, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.2943, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.9059, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(286.3830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(510.9321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.3328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(332.7242, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(620.8491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(522.8051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(549.1337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(339.9893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(711.3738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(760.0066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(398.5208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(631.3749, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(751.2498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1388.8113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(988.6576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1795.7639, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(898.1925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(568.0652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(242.0289, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.5865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.7896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(581.6304, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1247.9895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(316.5530, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(521.6014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(556.3659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(564.9589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.7160, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.4453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(788.2548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(573.4572, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(689.4706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.4388, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4895522388059701
Sentence level Krippendorff's alpha for Premises:  0.417910447761194
Additional attributes: 
	Total Sentences: 670
	Prediction setences having claims: 301
	Prediction sentences having premises: 373
	Reference setences having claims: 248
	Reference sentences having premises: 292


	Prediction Sentence having both claim and premise: 110
	Prediction Sentence having neither claim nor premise: 106
	Reference Sentence having both claim and premise: 44
	Reference Sentence having neither claim nor premise: 174


	Sentences having claim in both reference and prediction: 499
	Sentences having claim in only one of reference or prediction: 171
	Sentences having premise in both reference and prediction: 475
	Sentences having premise in only one of reference or prediction: 195
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(328.4164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(316.8128, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(436.7178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.1732, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.0507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(325.7913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.9710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(448.3229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(417.0540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(692.6771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(277.7360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.8813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(712.1464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.4362, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.5762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(899.5331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.4905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(496.7249, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(610.2209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.1920, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.9494, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.8332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.1602, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(354.0092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(353.3786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(369.1745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(281.4186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.2304, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(670.6997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(446.3125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(559.2533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.5343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(570.3398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(511.7719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.8875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.2308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.5217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(356.1369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(299.2253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(966.3723, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(418.2832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.8216, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.4827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.3313, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.6120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(314.6663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(490.7263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.0798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.7696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(294.8277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(388.0576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.8839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.1710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(758.9828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(534.6284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(621.2045, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.4769, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4626865671641791
Sentence level Krippendorff's alpha for Premises:  0.4716417910447761
Additional attributes: 
	Total Sentences: 670
	Prediction setences having claims: 264
	Prediction sentences having premises: 335
	Reference setences having claims: 248
	Reference sentences having premises: 292


	Prediction Sentence having both claim and premise: 79
	Prediction Sentence having neither claim nor premise: 150
	Reference Sentence having both claim and premise: 44
	Reference Sentence having neither claim nor premise: 174


	Sentences having claim in both reference and prediction: 490
	Sentences having claim in only one of reference or prediction: 180
	Sentences having premise in both reference and prediction: 493
	Sentences having premise in only one of reference or prediction: 177
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(317.4940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.6206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(447.3129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.5392, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.8648, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(242.7290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.5894, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.2383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.8241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(441.8974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.7779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.8547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.2490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.8742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.3209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(688.7281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(355.6887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(379.3178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(578.0781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.5218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.2396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.8344, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.5020, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.6938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.1244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(306.8450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.0054, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.7162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(387.9324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.4226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(399.8969, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.0700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(390.4877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(440.2256, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(206.2137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.4692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.6537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.1240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.5999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(854.1405, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(368.1981, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(226.9736, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.2044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.8205, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.9935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.7161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.0217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.2837, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.8388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.2908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(308.0010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.0378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.5833, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(516.0594, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(328.0934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(370.1642, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.9021, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4537313432835821
Sentence level Krippendorff's alpha for Premises:  0.3970149253731343
Additional attributes: 
	Total Sentences: 670
	Prediction setences having claims: 139
	Prediction sentences having premises: 414
	Reference setences having claims: 248
	Reference sentences having premises: 292


	Prediction Sentence having both claim and premise: 43
	Prediction Sentence having neither claim nor premise: 160
	Reference Sentence having both claim and premise: 44
	Reference Sentence having neither claim nor premise: 174


	Sentences having claim in both reference and prediction: 487
	Sentences having claim in only one of reference or prediction: 183
	Sentences having premise in both reference and prediction: 468
	Sentences having premise in only one of reference or prediction: 202
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(239.4874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.2542, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(339.8189, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.8440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.1165, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(312.5471, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.6965, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(362.5606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(327.4898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(704.4257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.1345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.7166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(493.0080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.2063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.3116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(633.4120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(282.3911, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(257.8349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(317.8068, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.7197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.1785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.2659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.7914, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.1513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(146.4197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.8345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.7012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.5541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(332.3073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.8173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.5743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.8530, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.2006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(390.3215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.6175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.1240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.6842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.5849, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.1546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(941.1290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(461.4680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(345.5455, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.9108, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.7742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.1489, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.9512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(524.5220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.6291, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(356.4843, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(488.6493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(347.9853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.8744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.4490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(544.7214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(446.8655, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(509.4260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.4415, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5074626865671642
Sentence level Krippendorff's alpha for Premises:  0.4776119402985075
Additional attributes: 
	Total Sentences: 670
	Prediction setences having claims: 247
	Prediction sentences having premises: 375
	Reference setences having claims: 248
	Reference sentences having premises: 292


	Prediction Sentence having both claim and premise: 83
	Prediction Sentence having neither claim nor premise: 131
	Reference Sentence having both claim and premise: 44
	Reference Sentence having neither claim nor premise: 174


	Sentences having claim in both reference and prediction: 505
	Sentences having claim in only one of reference or prediction: 165
	Sentences having premise in both reference and prediction: 495
	Sentences having premise in only one of reference or prediction: 175
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(155.2526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.8578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.8565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.7289, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.5307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.4619, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.2081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.9263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.8297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(390.0972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.2770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.1398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.5663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.4462, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.3165, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(601.8327, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.0110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(330.7630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(371.3846, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.3064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.2124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.9666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.5096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.8973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.9699, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.3296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(174.9742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(207.4272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.6936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(307.4982, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.5450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.1056, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(435.6368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(391.4944, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(207.8541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(244.4227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.4476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(237.3582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.7233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(773.3601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.8177, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.4966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.0929, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.0426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.0308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.5718, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(295.6046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.3866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.0020, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.5162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.7477, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.1698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.7311, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(480.1318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.8608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(353.1080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.7517, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4119402985074627
Sentence level Krippendorff's alpha for Premises:  0.5313432835820895
Additional attributes: 
	Total Sentences: 670
	Prediction sentences having claims: 331
	Prediction sentences having premises: 351
	Reference sentences having claims: 248
	Reference sentences having premises: 292


	Prediction Sentence having both claim and premise: 103
	Prediction Sentence having neither claim nor premise: 91
	Reference Sentence having both claim and premise: 44
	Reference Sentence having neither claim nor premise: 174


	Sentences having claim in both reference and prediction: 473
	Sentences having claim in only one of reference or prediction: 197
	Sentences having premise in both reference and prediction: 513
	Sentences having premise in only one of reference or prediction: 157
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(190.2431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.6955, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.9930, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.8797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.3509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.3159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.8621, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.2809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.1220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(382.1578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.7337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.4124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(295.9868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.3609, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.2917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(650.7655, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.1676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(240.2267, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(353.4714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.3948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.4419, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.9385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.0807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.7054, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.1134, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.5411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.3271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.3140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(266.7260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(174.6331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.9590, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.4270, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.7187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.3513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.7870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.6970, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.4360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.4418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.5100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(597.5706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(268.8147, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.9166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.6038, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.2002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.3434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.2110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.4625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.7537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.9033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.3072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.6363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.6269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.1531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(327.6029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(202.8561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(272.1662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.3079, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5014925373134329
Sentence level Krippendorff's alpha for Premises:  0.4656716417910448
Additional attributes: 
	Total Sentences: 670
	Prediction sentences having claims: 253
	Prediction sentences having premises: 371
	Reference sentences having claims: 248
	Reference sentences having premises: 292


	Prediction Sentence having both claim and premise: 86
	Prediction Sentence having neither claim nor premise: 132
	Reference Sentence having both claim and premise: 44
	Reference Sentence having neither claim nor premise: 174


	Sentences having claim in both reference and prediction: 503
	Sentences having claim in only one of reference or prediction: 167
	Sentences having premise in both reference and prediction: 491
	Sentences having premise in only one of reference or prediction: 179
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(95.6777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.1933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.5554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.7923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.4043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.2590, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.4842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.5551, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.6917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(281.7659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.3529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.6453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.1821, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.2269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.7771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(436.8645, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.0053, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.9014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(271.5463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.2535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.3669, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.3234, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.7564, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.8216, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.7607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.6566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.8670, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.1123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.5577, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.3792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.3636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.9092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.7346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.6772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.0913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.3834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.3912, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.1190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.3921, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(544.1824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.0134, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.4864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.0027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.4895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.3290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.8343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(222.2592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.0578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.4618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.1707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.7144, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.6690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.6787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.9541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(174.6366, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.9141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.6169, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4805970149253731
Sentence level Krippendorff's alpha for Premises:  0.5223880597014925
Additional attributes: 
	Total Sentences: 670
	Prediction sentences having claims: 270
	Prediction sentences having premises: 342
	Reference sentences having claims: 248
	Reference sentences having premises: 292


	Prediction Sentence having both claim and premise: 77
	Prediction Sentence having neither claim nor premise: 135
	Reference Sentence having both claim and premise: 44
	Reference Sentence having neither claim nor premise: 174


	Sentences having claim in both reference and prediction: 496
	Sentences having claim in only one of reference or prediction: 174
	Sentences having premise in both reference and prediction: 510
	Sentences having premise in only one of reference or prediction: 160
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(73.5585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.6769, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.7551, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.0140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.0511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.6651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.3695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.2006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.5304, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.2244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.4123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.1046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.4200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.0451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.4563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(393.2440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.3178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.0512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.7192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.2787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.6915, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.4730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.1830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.3738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.5898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.0257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.3274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.7156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.5696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.5659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.4538, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.0277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.9830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.8535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.6043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.1303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.8771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.5384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.4023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(500.0394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.2441, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.8394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.1882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.5475, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.6092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.8681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.2836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.1193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.6667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.3725, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.8454, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.4274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.3381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.7766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.5585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(229.1075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.1440, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4776119402985075
Sentence level Krippendorff's alpha for Premises:  0.4955223880597015
Additional attributes: 
	Total Sentences: 670
	Prediction sentences having claims: 259
	Prediction sentences having premises: 357
	Reference sentences having claims: 248
	Reference sentences having premises: 292


	Prediction Sentence having both claim and premise: 71
	Prediction Sentence having neither claim nor premise: 125
	Reference Sentence having both claim and premise: 44
	Reference Sentence having neither claim nor premise: 174


	Sentences having claim in both reference and prediction: 495
	Sentences having claim in only one of reference or prediction: 175
	Sentences having premise in both reference and prediction: 501
	Sentences having premise in only one of reference or prediction: 169
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(58.0945, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.8722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.8305, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.8178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.4874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.7308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.5736, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.2628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.4341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(207.7729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.5928, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.2321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.2508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.8590, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.2178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(330.7597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.2256, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.0340, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.7444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.3715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.5668, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.3826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.0028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.3216, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.1694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.5841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.6712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.6534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.3349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.2952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.6912, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.7074, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.5790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.8260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.0456, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.0348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.7778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.3689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.7702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(470.9053, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.0307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.6695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.3137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.8618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.2399, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.9882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(206.4529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.4032, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.3759, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.3267, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.4521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.9786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.7902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(222.3171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.4981, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(207.7024, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.3009, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4626865671641791
Sentence level Krippendorff's alpha for Premises:  0.5343283582089553
Additional attributes: 
	Total Sentences: 670
	Prediction sentences having claims: 280
	Prediction sentences having premises: 342
	Reference sentences having claims: 248
	Reference sentences having premises: 292


	Prediction Sentence having both claim and premise: 75
	Prediction Sentence having neither claim nor premise: 123
	Reference Sentence having both claim and premise: 44
	Reference Sentence having neither claim nor premise: 174


	Sentences having claim in both reference and prediction: 490
	Sentences having claim in only one of reference or prediction: 180
	Sentences having premise in both reference and prediction: 514
	Sentences having premise in only one of reference or prediction: 156
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(46.8400, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.6444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.9672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.4288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.7019, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.6997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.7686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.1057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.5791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.5770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.5960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.6180, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.9276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.9855, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.6043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.7102, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.5043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.2523, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.4139, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.6378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.7475, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.9829, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.9791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.8404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.4015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.2186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.5346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.2582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.7747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.8198, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.9039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.4786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.8809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.2030, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.0634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.3593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.1209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.7417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.7736, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(454.6837, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.4708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.2500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.9808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.7719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.1229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.5992, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.2809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.0749, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.6322, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.1315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.3081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.5538, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.5859, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.0084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.8465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.6124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.2728, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4805970149253731
Sentence level Krippendorff's alpha for Premises:  0.5134328358208955
Additional attributes: 
	Total Sentences: 670
	Prediction sentences having claims: 272
	Prediction sentences having premises: 351
	Reference sentences having claims: 248
	Reference sentences having premises: 292


	Prediction Sentence having both claim and premise: 78
	Prediction Sentence having neither claim nor premise: 125
	Reference Sentence having both claim and premise: 44
	Reference Sentence having neither claim nor premise: 174


	Sentences having claim in both reference and prediction: 496
	Sentences having claim in only one of reference or prediction: 174
	Sentences having premise in both reference and prediction: 507
	Sentences having premise in only one of reference or prediction: 163
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(37.9488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.5463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.4745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.2200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.4827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.8283, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.4850, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.8593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.4724, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.8640, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.4220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.7139, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(146.3070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.9872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.0713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(281.7822, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.4569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.4751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.5856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.8768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.7941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.3253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.1317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.2450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.7279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.4041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.2995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.5450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.6751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.1643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.2752, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.9968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.5687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.3706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.4390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.9257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.9919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.7031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.7832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.4681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.5978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.6773, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.0875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.4759, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.7779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.0560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.6608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.6367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.5867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.8755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.2230, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.4780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.2515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.2645, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.7728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.2760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.5751, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5074626865671642
Sentence level Krippendorff's alpha for Premises:  0.5044776119402985
Additional attributes: 
	Total Sentences: 670
	Prediction sentences having claims: 277
	Prediction sentences having premises: 352
	Reference sentences having claims: 248
	Reference sentences having premises: 292


	Prediction Sentence having both claim and premise: 80
	Prediction Sentence having neither claim nor premise: 121
	Reference Sentence having both claim and premise: 44
	Reference Sentence having neither claim nor premise: 174


	Sentences having claim in both reference and prediction: 505
	Sentences having claim in only one of reference or prediction: 165
	Sentences having premise in both reference and prediction: 504
	Sentences having premise in only one of reference or prediction: 166
				 Metric computations: None


		-------------RUN 2-----------
			------------EPOCH 1---------------
Loss:  tensor(3071.7847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2873.2134, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(4142.8218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3150.7559, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1047.9459, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1675.7727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1730.7029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1722.7911, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1743.3461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2400.7754, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2876.4055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1763.5959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1283.7578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(936.7993, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1138.1633, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1958.8873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3787.6118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2966.9148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1173.0807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(867.7560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1901.2272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2169.9114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1534.1875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2520.3052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1575.0039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2207.8789, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1364.1754, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1662.6448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2437.8562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2025.6562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2371.7986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(808.5112, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1179.3076, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1084.7170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2126.8269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1378.4517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1390.2739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2461.7920, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1787.2407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1921.6357, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2109.6416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1202.4668, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2088.7300, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1975.7296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(982.0956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1843.2649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(988.9736, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1790.8191, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1618.8345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1610.0959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2092.5435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2781.6597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2779.1143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1040.9913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2174.3899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1112.4562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1542.7427, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2568.9954, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.2566371681415929
Sentence level Krippendorff's alpha for Premises:  0.303834808259587
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 288
	Prediction sentences having premises: 201
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 13
	Prediction Sentence having neither claim nor premise: 202
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 426
	Sentences having claim in only one of reference or prediction: 252
	Sentences having premise in both reference and prediction: 442
	Sentences having premise in only one of reference or prediction: 236
				 Metric computations: None
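Note on the summary above: the reported alpha values appear to equal (agreeing sentences - disagreeing sentences) / total sentences, for example (426 - 252) / 678 ≈ 0.2566 for claims. The "in both reference and prediction" counts exceed the 212 reference claim sentences, so they presumably count sentences where the two labelings agree (both have the label or both lack it). The sketch below only reproduces that arithmetic from the printed counts. The function name `agreement_score` is hypothetical, and this is an observation about the numbers, not the training script's actual code.

```python
def agreement_score(agree: int, disagree: int) -> float:
    """Return (agree - disagree) / total, i.e. 2 * observed agreement - 1.

    Assumption: this matches the logged alpha only because the printed
    counts happen to satisfy it; the real metric code is not shown in the log.
    """
    total = agree + disagree
    if total == 0:
        raise ValueError("no sentences to score")
    return (agree - disagree) / total


# Counts copied from the epoch summary above (678 sentences in total).
claims = agreement_score(426, 252)
premises = agreement_score(442, 236)
print(f"claims: {claims:.4f}, premises: {premises:.4f}")
```

If this reading is right, a score of 0 corresponds to agreeing on exactly half of the sentences, so the values in this log (roughly 0.2 to 0.5) correspond to agreement on about 60% to 76% of sentences.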
			------------EPOCH 2---------------
Loss:  tensor(2414.3276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1967.0991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2848.9531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2146.8225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(691.7404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1244.6143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1256.7617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1206.0330, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1362.6792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2128.7605, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2128.9658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1262.4186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(950.5034, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(699.7791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(882.3627, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1565.2014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3365.4675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2497.5151, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(915.7495, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(648.7753, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1453.3447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1685.3943, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1145.5974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2063.4873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1431.5779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1677.5774, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1029.1255, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1460.7078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1873.9727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1420.5938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1996.8340, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(679.9216, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.8770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(808.3818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1660.2991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1153.7806, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1185.2812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1782.1726, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1511.0615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1467.3011, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1767.8103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1017.0388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1817.8375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1684.6089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(762.6513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1543.3977, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(720.0137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1310.1465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1291.8015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1315.2314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1690.8181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2347.0337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2412.0901, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(902.6440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1755.0071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(855.1071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1158.2522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2211.5742, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.303834808259587
Sentence level Krippendorff's alpha for Premises:  0.3097345132743363
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 260
	Prediction sentences having premises: 267
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 33
	Prediction Sentence having neither claim nor premise: 184
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 442
	Sentences having claim in only one of reference or prediction: 236
	Sentences having premise in both reference and prediction: 444
	Sentences having premise in only one of reference or prediction: 234
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(1774.0704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1657.9429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2303.8149, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1867.3706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(548.7318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1055.8838, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(997.3984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(980.9218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1121.9343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1811.5657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1815.1604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(966.0359, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(736.7018, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(537.7539, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(697.2944, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1277.1375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3022.5195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2145.1924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(730.7451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(450.4146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1125.7229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1379.4204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(812.9623, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1761.3557, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1207.1823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1295.9709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(772.3014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1233.6779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1558.0100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1157.2415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1701.5249, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(562.3856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(678.8096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(589.7657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1272.1219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1026.1326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1048.8118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1389.1118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1273.3530, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1168.8506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1454.3319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(878.2651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1673.6837, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1492.8333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(605.5356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1286.4238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(578.1051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1105.8197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1120.1329, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1135.4487, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1412.8348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1807.4066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1983.1195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(634.4293, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1393.2593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(760.8215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(980.6959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1774.3008, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.36283185840707965
Sentence level Krippendorff's alpha for Premises:  0.38053097345132747
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 290
	Prediction sentences having premises: 273
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 29
	Prediction Sentence having neither claim nor premise: 144
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 462
	Sentences having claim in only one of reference or prediction: 216
	Sentences having premise in both reference and prediction: 468
	Sentences having premise in only one of reference or prediction: 210
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(1158.6442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1420.3970, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1823.8591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1586.6760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.1546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(923.2671, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(830.8062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(860.9090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1041.7871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1392.7726, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1572.0300, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(813.0790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(504.8851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.6814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(528.3705, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1038.5129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2340.4150, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1574.4845, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(567.0110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(287.5780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(872.1993, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1187.8735, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(538.5270, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1496.9137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1107.4751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(961.5507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(521.1539, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1001.6674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1311.9540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(877.9529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1503.5120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(450.1497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(519.0276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.3219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(990.2239, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(784.3613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(883.8966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1141.9200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1067.8989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(902.5783, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1262.6777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(756.1055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1243.3767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1246.5170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.8485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1184.7046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.9581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(841.1414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(873.9175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1009.1748, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1279.3591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1562.7566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1671.5642, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(521.6557, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1112.5164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(564.8383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(891.0402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1475.8467, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.528023598820059
Sentence level Krippendorff's alpha for Premises:  0.327433628318584
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 120
	Prediction sentences having premises: 467
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 35
	Prediction Sentence having neither claim nor premise: 126
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 518
	Sentences having claim in only one of reference or prediction: 160
	Sentences having premise in both reference and prediction: 450
	Sentences having premise in only one of reference or prediction: 228
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(897.0757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1187.9625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1398.3245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1556.6829, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(320.2637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(630.9965, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(651.8904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(670.5769, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(786.6395, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1095.8125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1278.1714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(575.5409, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(324.6057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(268.3661, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(369.1510, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(851.0084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2073.7908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1275.2500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(527.4097, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.2525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(773.8220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(978.8527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(410.6212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1265.4100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(828.2074, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(742.3588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.6177, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(701.1752, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(983.4512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(663.9509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1036.4023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.3671, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(406.2289, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(359.4509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(780.7622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(592.4178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(902.6390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1040.3577, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1077.8041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(923.4575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1043.9724, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(637.7155, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1270.7823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1320.6738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(349.7654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(756.2130, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(367.1284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(704.3990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(689.3538, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(753.1212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(877.9500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1305.0281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1557.7043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.3146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1021.0507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(459.6299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(699.5089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1191.5859, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.21533923303834812
Sentence level Krippendorff's alpha for Premises:  0.3126843657817109
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 436
	Prediction sentences having premises: 154
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 39
	Prediction Sentence having neither claim nor premise: 127
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 412
	Sentences having claim in only one of reference or prediction: 266
	Sentences having premise in both reference and prediction: 445
	Sentences having premise in only one of reference or prediction: 233
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(900.7333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1359.7268, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1523.7887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1425.1074, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(242.6742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(538.9818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(518.9747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(552.2357, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(637.6154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(800.8626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1109.5508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(479.4208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(295.6584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.2141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.5622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(739.8406, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1838.6847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1216.5518, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(362.1254, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.4202, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(588.8612, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(940.4902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(311.9349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(996.3507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(646.4261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(529.1265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.4412, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(584.2733, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(785.3148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(603.7410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1088.5117, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(268.7830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.7843, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.6089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(548.8948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(454.9044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(719.1148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(883.4122, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(677.1704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(686.2249, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(846.4806, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(558.5496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(865.6688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1029.4174, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.2979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(584.4854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.9334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(602.5869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(691.2293, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(752.6071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(850.2010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1173.2444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1251.4788, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(296.0959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(809.5103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(387.0557, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(566.4141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(855.3571, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.46607669616519176
Sentence level Krippendorff's alpha for Premises:  0.3893805309734514
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 233
	Prediction sentences having premises: 352
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 44
	Prediction Sentence having neither claim nor premise: 137
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 497
	Sentences having claim in only one of reference or prediction: 181
	Sentences having premise in both reference and prediction: 471
	Sentences having premise in only one of reference or prediction: 207
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(394.4739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(782.8193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(901.0144, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1072.1715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.8927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(418.1464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.3664, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(404.2923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(565.8276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(685.5477, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1120.7168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(445.2816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(164.4168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.5381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(222.4513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(544.9415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1390.4891, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(779.7122, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.8762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.4766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.7115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(691.5535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.6098, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(811.4690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(552.9762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(463.1506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.2713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(440.7989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(529.1394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(413.9889, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(611.4000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.6571, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(287.5854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.2474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(541.8190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.6555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(430.6771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(639.8068, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(743.8628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(748.8174, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(654.3925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(439.2455, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(897.5380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(937.9492, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.3564, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(443.4141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.9853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(563.5399, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(518.4368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(494.6100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(459.6919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(964.2089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1041.8029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.9963, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(657.4237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(279.4107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(458.3672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(743.0649, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.37168141592920356
Sentence level Krippendorff's alpha for Premises:  0.3952802359882006
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 317
	Prediction sentences having premises: 334
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 60
	Prediction Sentence having neither claim nor premise: 87
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 465
	Sentences having claim in only one of reference or prediction: 213
	Sentences having premise in both reference and prediction: 473
	Sentences having premise in only one of reference or prediction: 205
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(406.2574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(651.0312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(697.1332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1003.7622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.4380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.9050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(441.1250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.5191, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(482.6357, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(666.9304, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(895.0693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(406.1551, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.2217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.6873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.5763, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(466.8036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1307.1034, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(601.8478, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(229.4856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.6110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(424.7286, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(654.1917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.2516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(723.7405, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(397.0617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(356.0143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.9844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(416.8627, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(511.6433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.5189, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(576.8688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.3897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.0124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.1268, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(335.1604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.9548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(371.6761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(505.5767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(517.2137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(402.7080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(516.9197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(360.4174, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(828.2012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(895.1683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.7124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(399.5775, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.8663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(373.9151, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(440.0354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.4050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.1427, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(953.9448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(720.1388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.4493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(544.3205, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.8027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(407.8147, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(479.2377, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.43067846607669613
Sentence level Krippendorff's alpha for Premises:  0.37168141592920356
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 245
	Prediction sentences having premises: 388
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 49
	Prediction Sentence having neither claim nor premise: 94
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 485
	Sentences having claim in only one of reference or prediction: 193
	Sentences having premise in both reference and prediction: 465
	Sentences having premise in only one of reference or prediction: 213
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(280.2823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(371.8571, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(521.9520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(796.4616, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.0687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(229.1683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.4820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.3432, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(338.2851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(527.3574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(823.1533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(359.0135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.5827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.0237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.1598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(605.1671, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1467.8125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(807.8210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(345.2808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.8482, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(564.7971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(780.4941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.5387, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(890.3942, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(365.0123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(389.3551, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.0196, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(420.3549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(462.2164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(338.7248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(618.9111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(250.9365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.8356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.8426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(357.2719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(249.5844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(322.7557, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(510.6481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.3753, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(340.9836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(461.8001, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(353.2581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(600.3657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(753.3335, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.2671, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(479.1100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.2634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(360.0801, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(676.0586, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(694.8048, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(614.9755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1058.7498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(922.4807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.6410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(908.9440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(863.2698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(918.6323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1022.2516, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5191740412979351
Sentence level Krippendorff's alpha for Premises:  0.32448377581120946
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 113
	Prediction sentences having premises: 494
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 38
	Prediction Sentence having neither claim nor premise: 109
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 515
	Sentences having claim in only one of reference or prediction: 163
	Sentences having premise in both reference and prediction: 449
	Sentences having premise in only one of reference or prediction: 229
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(309.3895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(490.2927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(686.2821, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(946.9707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.2526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.2554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.2130, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(342.0331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.8725, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(373.7364, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(679.6406, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(206.5005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.0789, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.1218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.4190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(339.8292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1267.2910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(691.4299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.3123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.4881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(656.3676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(822.0331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.2305, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(819.3036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(599.6393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1058.8485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(582.5996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(811.4988, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1134.2227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(805.9321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1436.4915, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.6126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.1852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.8940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(595.0466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(571.3262, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(412.7473, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(730.9141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(557.1988, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(430.0143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(597.6031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.8671, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(447.8847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(609.1204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.9116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(287.9443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.1168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(237.9077, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(337.8090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(404.1029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(392.5596, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(782.3349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(803.8694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(158.8822, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(584.0463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.8275, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(557.1309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(615.2947, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.48672566371681414
Sentence level Krippendorff's alpha for Premises:  0.25073746312684364
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 46
	Prediction sentences having premises: 543
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 12
	Prediction Sentence having neither claim nor premise: 101
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 504
	Sentences having claim in only one of reference or prediction: 174
	Sentences having premise in both reference and prediction: 424
	Sentences having premise in only one of reference or prediction: 254
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(715.3438, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1060.6320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1343.4465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1235.1636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(484.2496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(767.4302, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(825.0715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(922.0384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(417.0646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1116.7424, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1366.7556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(592.3434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(299.0566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.6238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(337.5260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(722.1414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1807.6162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(900.8263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.6573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(146.9629, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(304.0597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(792.6004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.2712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(501.7960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(354.4461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.0005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.8622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.6500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(378.4935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.9335, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(379.8767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.9791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.0812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.5883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(279.9297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(254.9385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(365.7040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(947.6782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(535.2903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(400.4830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(825.0088, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(445.3858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(604.0840, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(788.2550, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(280.2031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(856.1617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.2572, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(757.0402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(518.6476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(622.9716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(608.8695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1120.7526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(896.3138, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(444.3033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(883.1154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.4286, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(427.7577, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(547.0688, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.415929203539823
Sentence level Krippendorff's alpha for Premises:  0.2802359882005899
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 290
	Prediction sentences having premises: 347
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 59
	Prediction Sentence having neither claim nor premise: 100
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 480
	Sentences having claim in only one of reference or prediction: 198
	Sentences having premise in both reference and prediction: 434
	Sentences having premise in only one of reference or prediction: 244
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(186.8564, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.3062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(549.3102, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(727.4570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.6317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.1657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.7734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.0236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(240.5702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(368.2272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(632.1245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.6597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.4282, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.1512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.7193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.5516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1371.5070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(639.1887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.2297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.6867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(395.3410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(835.0654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.9653, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(594.9521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(624.9576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(522.7567, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.3716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(458.8711, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(645.2175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(335.5203, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(622.5833, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(210.2834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.4449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.7371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(411.4679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.7131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(299.9640, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.4621, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(431.3431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(334.5708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(370.8006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.5052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(336.0192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(553.0680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.7761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.5909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.4153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.5467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(282.4672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.7029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.8870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(695.7225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(849.2474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.3824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(776.1168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(359.0954, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(444.3284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(656.4573, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5073746312684366
Sentence level Krippendorff's alpha for Premises:  0.37168141592920356
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 221
	Prediction sentences having premises: 380
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 62
	Prediction Sentence having neither claim nor premise: 139
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 511
	Sentences having claim in only one of reference or prediction: 167
	Sentences having premise in both reference and prediction: 465
	Sentences having premise in only one of reference or prediction: 213
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(192.2379, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(345.1297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(546.6694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(721.9836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.7171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.4113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.0109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.8295, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.0054, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.8046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(594.6440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.3044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.7403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.6936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.9502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.0349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(890.0674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(424.3461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.0464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.0039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(295.6669, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(469.2710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.3598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(404.8512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.5317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.4714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.4393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(462.2190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(605.0986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.0306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(642.3434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.6407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.1767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.2395, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(502.0752, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(297.7505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(456.1394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(462.3446, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(528.1344, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(408.1202, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(501.5751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(275.6814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(527.1676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(596.0264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.2145, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.2744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.2393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.1776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.8776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.8842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.4723, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(642.0559, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(522.1326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.7007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(269.7024, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.1082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.8241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.5716, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5191740412979351
Sentence level Krippendorff's alpha for Premises:  0.42772861356932157
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 127
	Prediction sentences having premises: 403
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 37
	Prediction Sentence having neither claim nor premise: 185
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 515
	Sentences having claim in only one of reference or prediction: 163
	Sentences having premise in both reference and prediction: 484
	Sentences having premise in only one of reference or prediction: 194
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(193.2351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(316.5280, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(496.9761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(596.7563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.3780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(365.0310, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(442.2354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(250.4626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(499.8152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(706.5951, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(821.4268, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(365.7186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.0359, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.9111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.8716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(282.4918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(960.0691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(417.0130, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.6663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.2075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.8204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.7531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.2263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(206.1930, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.9031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.8420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.4623, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.5123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(229.2638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.1151, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.0270, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.4766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.2402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.3297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.0589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.9884, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(229.7741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.8767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.2693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.8900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(327.0564, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.8221, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.8826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(466.7311, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.0453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.7696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.1782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.6049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.7770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.7732, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.4060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(836.2476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(704.2914, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.2804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(272.1882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.1697, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.8475, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.9356, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.48672566371681414
Sentence level Krippendorff's alpha for Premises:  0.359882005899705
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 230
	Prediction sentences having premises: 414
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 61
	Prediction Sentence having neither claim nor premise: 95
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 504
	Sentences having claim in only one of reference or prediction: 174
	Sentences having premise in both reference and prediction: 461
	Sentences having premise in only one of reference or prediction: 217
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(105.2044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.8524, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.1170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(568.9127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.3996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.3642, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.5849, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.9961, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(158.2911, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.6234, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(483.2118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.5306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.1419, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.4883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.4901, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.4378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(630.1318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.5496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.6738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.6260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.9064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.4481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.2103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.9764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.6705, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.0099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.8791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.7603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.9214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.5450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.3420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.7889, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.5030, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.5274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.4947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.2558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.5693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.7769, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.8057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.8622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.2140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.5023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.2758, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(327.5499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.8487, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.3285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.4269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.6909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.4254, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.2222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.3416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(495.4860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(380.7699, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.7259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.0175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.6103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.0063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(146.1548, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5339233038348082
Sentence level Krippendorff's alpha for Premises:  0.41002949852507375
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 190
	Prediction sentences having premises: 395
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 52
	Prediction Sentence having neither claim nor premise: 145
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 520
	Sentences having claim in only one of reference or prediction: 158
	Sentences having premise in both reference and prediction: 478
	Sentences having premise in only one of reference or prediction: 200
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(33.2396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.9025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.2621, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.7200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.3869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.2760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.6211, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.7924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.7447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.7690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.5012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.4808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.3273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.0223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.2332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.6892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(580.2114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.8238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.0491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.8836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.5838, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.1354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.1865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.4255, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.6690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.3148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.1927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.1952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.1844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.8025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.0601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.7844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.1195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.5609, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.4423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.4156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.5599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.1661, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.3909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.6015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(210.5521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.2307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.8181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(272.7469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.8258, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.8512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.2466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.3673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.7191, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.9011, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.3369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(447.2309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(314.8724, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.1807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.4502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.2457, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.3012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.8544, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5191740412979351
Sentence level Krippendorff's alpha for Premises:  0.41002949852507375
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 203
	Prediction sentences having premises: 399
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 52
	Prediction Sentence having neither claim nor premise: 128
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 515
	Sentences having claim in only one of reference or prediction: 163
	Sentences having premise in both reference and prediction: 478
	Sentences having premise in only one of reference or prediction: 200
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(18.5001, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.0014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.9651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.1480, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.7864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.3658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.4274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.2552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.2260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.8789, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(414.1935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.3793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.4476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.1708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.8399, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.3706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(508.2078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.9247, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.7824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.3881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.9118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.2396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.4545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.5283, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.5861, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.8157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.3673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.0392, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.9873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.8553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.4041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.5575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.2220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.8646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.7307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.6165, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.2563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.5500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.2279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.0418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.1472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.5222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.1431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.5166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.1598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.6236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.9276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.6526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.0601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.9121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.8985, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(390.1515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(296.6249, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.7993, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.3709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.2186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.4522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.9577, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5162241887905605
Sentence level Krippendorff's alpha for Premises:  0.415929203539823
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 206
	Prediction sentences having premises: 399
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 50
	Prediction Sentence having neither claim nor premise: 123
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 514
	Sentences having claim in only one of reference or prediction: 164
	Sentences having premise in both reference and prediction: 480
	Sentences having premise in only one of reference or prediction: 198
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(14.0106, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.0088, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.5620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.7604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.4798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.6942, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.3212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.2693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.2175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.0956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(376.4702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.7085, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.6082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.1947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.9505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.7865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(475.7896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.2071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.9391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.0100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.9083, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.1293, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.6403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.2571, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.6168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.3552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.9092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.7178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.9625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.0811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.9319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.1362, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.4895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.7587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.0374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.6934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.2770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.9153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.8183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.2425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.2085, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.8511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.2010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(206.6917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.9063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.7519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.0249, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.1231, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.0198, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.2146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.1050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(362.3000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.7867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.8353, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.3617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.6334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.8132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.4973, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5132743362831859
Sentence level Krippendorff's alpha for Premises:  0.415929203539823
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 195
	Prediction sentences having premises: 405
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 47
	Prediction Sentence having neither claim nor premise: 125
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 513
	Sentences having claim in only one of reference or prediction: 165
	Sentences having premise in both reference and prediction: 480
	Sentences having premise in only one of reference or prediction: 198
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(9.6791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.8316, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.6150, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.1446, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.1937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.6058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.1146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.3287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.3444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.5966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(370.4848, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.9078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.4006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.3342, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.9586, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.8337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(459.5129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.0025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.0604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.2131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.9143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.6377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.5145, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.3555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.3963, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.6718, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.1153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.5193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.2421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.1709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.4193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.0929, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.0667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.6230, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.4002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.5620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.1342, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.2987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.2680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.5959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.4678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.3996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.4094, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.7306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.1361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.9068, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.9086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.8492, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.0902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.3037, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.7043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(369.0850, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(271.3771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.5113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.3371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.1643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.8302, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.1902, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5014749262536873
Sentence level Krippendorff's alpha for Premises:  0.41002949852507375
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 203
	Prediction sentences having premises: 395
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 45
	Prediction Sentence having neither claim nor premise: 125
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 509
	Sentences having claim in only one of reference or prediction: 169
	Sentences having premise in both reference and prediction: 478
	Sentences having premise in only one of reference or prediction: 200
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(7.7318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.3532, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.2387, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.5966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.5980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.7796, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.7629, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.1145, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.0654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.1400, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(378.8138, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.7896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.7388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.7853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.9022, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.6984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(434.4407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.7169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.7093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(6.5287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.8962, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.4532, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(6.9716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.7700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.9558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.0655, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.3106, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.1962, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.9972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.7222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.6006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.4210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.8201, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.9240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.9181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.9389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.0822, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.9244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.3429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.9187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.6263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.1004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.5522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.4568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.3147, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.5481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.0067, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.7338, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.7131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.3647, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.6730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(334.8472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.0349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.7905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.0337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.5653, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.0702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.6454, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5103244837758112
Sentence level Krippendorff's alpha for Premises:  0.41887905604719766
Additional attributes: 
	Total Sentences: 678
	Prediction sentences having claims: 198
	Prediction sentences having premises: 402
	Reference sentences having claims: 212
	Reference sentences having premises: 301


	Prediction Sentence having both claim and premise: 47
	Prediction Sentence having neither claim nor premise: 125
	Reference Sentence having both claim and premise: 56
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 512
	Sentences having claim in only one of reference or prediction: 166
	Sentences having premise in both reference and prediction: 481
	Sentences having premise in only one of reference or prediction: 197
				 Metric computations: None


		-------------RUN 3-----------
			------------EPOCH 1---------------
Loss:  tensor(2253.7371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2098.4019, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2746.9277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(5073.8867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2361.9290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1728.0450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1652.9537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(517.2665, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2183.1799, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1530.8842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2620.5818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2662.7998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2595.0649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2653.1812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2117.6228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2287.1875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1845.7742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2144.4604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1545.1982, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1965.5781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1686.0348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1376.6882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1571.8760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1897.7173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1504.8639, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1046.3689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1005.8541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1822.5139, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2597.7161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1305.2081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1452.3164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1684.8972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2189.8306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(878.3109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1839.1371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1826.2207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1066.1194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2482.5146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2805.6003, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2070.5393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1223.7104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1640.2507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1623.2773, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2087.1184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1692.3550, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2463.7007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2172.2900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2234.9316, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2542.5508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1512.3375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2385.4624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2010.8740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2276.0361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1727.2013, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1052.9419, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1420.8337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1169.4779, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.24202626641651037
Sentence level Krippendorff's alpha for Premises:  0.2682926829268293
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 54
	Prediction sentences having premises: 296
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 2
	Prediction Sentence having neither claim nor premise: 185
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 331
	Sentences having claim in only one of reference or prediction: 202
	Sentences having premise in both reference and prediction: 338
	Sentences having premise in only one of reference or prediction: 195
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(1602.4585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1706.6389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2058.8169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(4064.2349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1840.4402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1303.4956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1302.8674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(395.1814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1674.6116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1134.9619, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2227.1694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2197.8247, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2282.9448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2291.8892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1736.7932, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1879.8044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1669.5507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1741.9657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1296.4805, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1620.3417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1397.8540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1174.4628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1189.0520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1424.1836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1265.6141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(875.2336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(865.5483, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1449.4075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2167.4592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1089.0381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1179.6279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1309.8638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1830.1127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(743.1816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1537.2383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1558.7976, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(856.7591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2172.4888, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2422.2515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1976.4789, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(994.9308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1339.1592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1288.3835, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1703.0161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1256.6820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2123.9302, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1609.8804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1897.8927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2270.2773, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1159.9983, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1920.6777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1825.5905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2039.3949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1518.1182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(883.1737, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1324.2585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(977.8743, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3883677298311444
Sentence level Krippendorff's alpha for Premises:  0.41463414634146345
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 273
	Prediction sentences having premises: 247
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 86
	Prediction Sentence having neither claim nor premise: 99
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 370
	Sentences having claim in only one of reference or prediction: 163
	Sentences having premise in both reference and prediction: 377
	Sentences having premise in only one of reference or prediction: 156
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(1235.4856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1518.1376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1727.0244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3510.0449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1548.6338, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1073.3314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1055.5466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(351.0609, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1380.9731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(925.7128, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1803.5033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1820.8135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1943.6377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1984.4287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1492.0165, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1616.0840, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1457.1318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1442.3785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1109.7314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1335.6140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1174.6638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1046.5200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(929.5128, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1140.0524, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1095.2087, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(675.0497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(697.5629, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1272.0063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1874.3037, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(913.8755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1046.8918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1049.5834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1639.2629, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(606.3926, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1201.7433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1327.3499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(716.7781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1863.0505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2047.1169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1604.7568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(678.6005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(891.1905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(810.5371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1234.1711, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1028.6481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1833.0773, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1352.4556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1507.9611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1759.2247, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(875.8199, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1698.9409, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1455.3872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1644.6531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1243.3282, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(714.4198, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1128.0535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(866.7092, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4671669793621013
Sentence level Krippendorff's alpha for Premises:  0.46341463414634143
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 180
	Prediction sentences having premises: 320
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 73
	Prediction Sentence having neither claim nor premise: 106
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 391
	Sentences having claim in only one of reference or prediction: 142
	Sentences having premise in both reference and prediction: 390
	Sentences having premise in only one of reference or prediction: 143
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(1040.9283, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1203.1501, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1409.1650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3073.6392, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1285.2474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(890.8285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(846.6841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.1348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1029.5781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(624.9988, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1264.3336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1278.7017, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1531.6006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1540.9302, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1234.8162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1408.8014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1298.0648, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1173.5927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(957.9055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1105.6096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(981.7209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(854.7158, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(653.3813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(834.0480, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(858.2499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(511.3029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(502.7826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1004.5860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1540.8894, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(700.8751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(847.6949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(724.4415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1453.4951, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(496.3755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(974.4674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1123.6566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(599.5107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1673.6445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1741.4795, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1224.8831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(501.2448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(645.9699, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(498.3289, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(993.5683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(749.9685, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1491.8894, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1091.8784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1330.0166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1418.5405, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(635.2499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1429.6423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1153.5251, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1345.1440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1021.5433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(637.5734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(903.3369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(734.5441, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.23452157598499057
Sentence level Krippendorff's alpha for Premises:  0.3696060037523452
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 362
	Prediction sentences having premises: 107
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 32
	Prediction Sentence having neither claim nor premise: 96
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 329
	Sentences having claim in only one of reference or prediction: 204
	Sentences having premise in both reference and prediction: 365
	Sentences having premise in only one of reference or prediction: 168
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(925.5168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1079.2654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1365.7842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2669.5864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(929.7740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(655.0317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(608.6530, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.9303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(818.2004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(406.5010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(978.0579, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1000.9868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1411.9524, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1329.4463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1033.6921, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1423.9858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(942.4008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1170.4532, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(788.3505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(876.7107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(751.4727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(680.1907, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(571.1513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(777.2577, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(713.4698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(403.4583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(348.7487, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(822.1324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1316.6713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(575.8594, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.5347, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(566.2767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1108.4170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(369.5015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(828.5317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(845.3672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(609.0770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1382.3271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1590.2046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1051.7549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(332.7397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(353.0376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.1949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(701.9597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(627.0906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1276.4935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(848.4887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(955.9398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1222.9219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(434.1130, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1248.7659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(967.9239, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1245.4902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(921.2682, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(507.9047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(732.1625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(627.9931, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5121951219512195
Sentence level Krippendorff's alpha for Premises:  0.49718574108818014
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 194
	Prediction sentences having premises: 299
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 77
	Prediction Sentence having neither claim nor premise: 117
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 403
	Sentences having claim in only one of reference or prediction: 130
	Sentences having premise in both reference and prediction: 399
	Sentences having premise in only one of reference or prediction: 134
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(691.5281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(772.7589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(994.5495, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2392.3176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(794.1919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(513.1427, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(458.3184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.0159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(656.1902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(320.5876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(674.3638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(844.4753, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(964.1281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1139.2581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(904.7905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1023.7737, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(804.7161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(790.4580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(673.3482, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(690.4232, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(612.5012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(520.9548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(411.2438, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(564.6070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(531.5132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.4937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.8039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(629.5186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(955.9459, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.0594, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(572.0188, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(495.2614, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1029.1780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(276.6499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(495.7042, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(720.1412, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(380.1204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1123.8713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1179.8328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(704.2358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.0068, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(274.7758, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.0934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(554.6724, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(491.9886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(926.3432, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(767.9210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(768.6028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(935.9316, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(335.9592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1071.0476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(669.9152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(841.5433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(598.4069, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.4233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(493.7681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(500.7312, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4371482176360225
Sentence level Krippendorff's alpha for Premises:  0.5084427767354597
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 252
	Prediction sentences having premises: 250
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 58
	Prediction Sentence having neither claim nor premise: 89
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 383
	Sentences having claim in only one of reference or prediction: 150
	Sentences having premise in both reference and prediction: 402
	Sentences having premise in only one of reference or prediction: 131
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(491.8535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(552.3423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(673.9630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1904.1477, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(693.1407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.7437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.6666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(146.2057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(542.4807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(226.0433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(511.8105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(584.8884, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(656.6357, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(958.8418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(737.5278, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1003.4033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(662.1938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(743.7519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(605.1834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(638.7097, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(527.3912, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(464.7473, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(419.7365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(539.7732, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.4691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.3078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.7047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(524.0065, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(872.3877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(339.3604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.8720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(325.0049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(750.5500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.3488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(398.8900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(550.8802, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.5498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(798.0942, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(922.2397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(567.3222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.5410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.1465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(164.6618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(438.4240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(469.5587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(792.4105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(601.5015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(631.5283, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(762.9014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(306.1484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1037.3135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(593.3707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(760.0250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(496.8002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.3161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(482.5653, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(417.1469, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.41463414634146345
Sentence level Krippendorff's alpha for Premises:  0.49718574108818014
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 286
	Prediction sentences having premises: 195
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 58
	Prediction Sentence having neither claim nor premise: 110
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 377
	Sentences having claim in only one of reference or prediction: 156
	Sentences having premise in both reference and prediction: 399
	Sentences having premise in only one of reference or prediction: 134
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(349.8886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(355.2130, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(587.2828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1441.9656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(480.2904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.6523, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.5502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.5575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(412.6460, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.2380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(422.5940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(447.7490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(461.3731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(911.4245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(729.5388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1115.9973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(432.7803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(743.6010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(573.9675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.1720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.1907, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.4597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.8947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.5790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(380.5298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.6393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.5643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(514.3098, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1072.3781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(405.8065, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(844.0706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(592.2467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(771.9712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.2981, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(448.1137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(543.4600, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.7071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(669.1350, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(783.6682, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.4108, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.6591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.8883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.9999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(317.1026, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(371.2303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(617.4252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(609.5496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(547.8624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(687.6135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(319.9536, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(994.1479, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(772.0127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(857.3771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.7152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.9969, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(622.1853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(314.3839, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.42964352720450283
Sentence level Krippendorff's alpha for Premises:  0.4521575984990619
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 232
	Prediction sentences having premises: 303
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 100
	Prediction Sentence having neither claim nor premise: 98
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 381
	Sentences having claim in only one of reference or prediction: 152
	Sentences having premise in both reference and prediction: 387
	Sentences having premise in only one of reference or prediction: 146
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(273.4978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.5693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(424.8555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1177.2126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(399.3095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.6273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.3959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.5090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.7899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.1209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(341.4673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(333.5270, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.8252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(716.9657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(514.4641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(576.9252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(339.6988, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(527.8282, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(391.9989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(400.3576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(311.9210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.5847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.3123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.2873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.2751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.8718, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.7112, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(357.6207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(589.0773, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.6484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(257.6240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.7504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(552.1710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.8949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.5448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(338.7009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.5844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(519.7940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(529.9769, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(304.1914, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.4301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.6650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.1124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(277.9383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.8100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(479.1233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(439.1875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(367.7576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(430.2261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.9302, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(748.4899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.0986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(459.6896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.1957, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.0853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.4177, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.4424, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4934333958724203
Sentence level Krippendorff's alpha for Premises:  0.48968105065666045
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 197
	Prediction sentences having premises: 315
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 84
	Prediction Sentence having neither claim nor premise: 105
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 398
	Sentences having claim in only one of reference or prediction: 135
	Sentences having premise in both reference and prediction: 397
	Sentences having premise in only one of reference or prediction: 136
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(177.8662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.9878, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.3568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(993.3382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(327.7175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.5968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.0674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.6767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.9120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.7428, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.2870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.8275, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.7125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(687.6203, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(398.8181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(584.5972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(296.1892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(427.2413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(314.3779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(385.7393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(346.1948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.4698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.5778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(164.9063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.6913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.8177, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.4064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(361.7658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(513.5825, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.6825, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.5373, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.6886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(431.2053, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.2023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.8042, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.4500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.3662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(395.3522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(407.9679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.3858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.2228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.5124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.3079, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.7534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.0150, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(434.5575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.4345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.7504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(399.2748, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.3990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(701.3254, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.6065, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.9839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.5618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.6099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.1797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.7632, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4934333958724203
Sentence level Krippendorff's alpha for Premises:  0.4446529080675422
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 167
	Prediction sentences having premises: 309
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 68
	Prediction Sentence having neither claim nor premise: 125
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 398
	Sentences having claim in only one of reference or prediction: 135
	Sentences having premise in both reference and prediction: 385
	Sentences having premise in only one of reference or prediction: 148
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(167.5366, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.5237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.3646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(907.4422, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(268.8743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.5894, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.0780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.5050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.6015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.4467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.5277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.6724, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.4491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(524.7762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.8329, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(550.6744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(287.9821, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(365.9461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.2606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.7206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.3443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.4712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.8870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.6361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.5790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.1304, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.9618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(296.9177, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(546.8260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.8740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.5129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.0697, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(475.6021, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.5245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.2246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.4397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.9593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(536.2588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(777.6933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(558.4948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.7853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.2003, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.8778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.1145, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.1392, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.7323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(332.5558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.0004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(317.1656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.5318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(681.0059, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.0782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(330.6512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.8319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.6420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.3619, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(368.9692, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4183864915572233
Sentence level Krippendorff's alpha for Premises:  0.42964352720450283
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 73
	Prediction sentences having premises: 353
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 30
	Prediction Sentence having neither claim nor premise: 137
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 378
	Sentences having claim in only one of reference or prediction: 155
	Sentences having premise in both reference and prediction: 381
	Sentences having premise in only one of reference or prediction: 152
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(332.6007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(392.3745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(435.4541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1360.8866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.2008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(368.0154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.8473, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.9673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.8348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.2580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.7566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.8209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.8431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(530.2778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(275.1992, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(398.1784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.6230, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(314.2474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.5254, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(244.3506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(222.2504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.1162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.1593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.1714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.4095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.9046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.8864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(240.6754, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(462.4591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.4337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.6938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.2729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(469.9131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.3471, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.9235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(335.5103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.6967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(503.6615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(660.0101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.7318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.2207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.2307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.4054, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(314.1035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.0278, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(720.0127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(581.3630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(571.6044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(307.3310, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.6571, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(603.9989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.4536, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(294.1453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.7783, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.6196, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.9162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.2891, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.425891181988743
Sentence level Krippendorff's alpha for Premises:  0.3545966228893058
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 95
	Prediction sentences having premises: 387
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 43
	Prediction Sentence having neither claim nor premise: 94
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 380
	Sentences having claim in only one of reference or prediction: 153
	Sentences having premise in both reference and prediction: 361
	Sentences having premise in only one of reference or prediction: 172
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(148.9101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.9651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(346.9307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1039.2190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.1735, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(373.5634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(339.3645, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.3513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(436.5910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.1841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(470.5814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(527.4916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(630.9083, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(693.1757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(348.9906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(695.9176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.8565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(398.7245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.7996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(316.7200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.1732, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.0278, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.0343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.8167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.9206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.8659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.0269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.2321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(422.6332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.5679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.6122, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.6369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(465.3451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.7887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(237.3124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(421.0793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.6881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(805.1295, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(911.7113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(285.1354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.8421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.6783, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.0114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.1618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.8756, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(415.5168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(455.2805, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(440.2623, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(764.3933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.3550, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(929.9455, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(377.4485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(425.3091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.2046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.0378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(319.6694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(282.4143, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4521575984990619
Sentence level Krippendorff's alpha for Premises:  0.49718574108818014
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 218
	Prediction sentences having premises: 311
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 82
	Prediction Sentence having neither claim nor premise: 86
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 387
	Sentences having claim in only one of reference or prediction: 146
	Sentences having premise in both reference and prediction: 399
	Sentences having premise in only one of reference or prediction: 134
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(229.0264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.4096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(286.2669, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(995.8223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(286.4662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.8511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.5128, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.3101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.7531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.7016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.1628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.8763, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.5876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(597.0524, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(306.5771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(553.3367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(242.0263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(467.5701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(317.4462, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(418.3766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(408.7781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(411.3423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.7390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.7400, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(269.6027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.1441, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.9758, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(357.5097, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(410.6324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.5120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.7136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.7224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(531.5458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.6172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.8852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(341.8206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.3739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(480.2235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(542.1408, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(281.0273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.7860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.8645, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.0093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.9734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(280.2144, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(620.2886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(421.8297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(420.5416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(472.9782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(322.8289, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(751.0715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(347.6182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(391.2010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(498.6222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(412.8973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.0534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.3894, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.33958724202626644
Sentence level Krippendorff's alpha for Premises:  0.5234521575984991
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 312
	Prediction sentences having premises: 200
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 81
	Prediction Sentence having neither claim nor premise: 102
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 357
	Sentences having claim in only one of reference or prediction: 176
	Sentences having premise in both reference and prediction: 406
	Sentences having premise in only one of reference or prediction: 127
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(147.6676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(275.3433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.3600, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(903.5421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.5915, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.7255, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.7703, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.8200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.8872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.5172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.6266, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.0436, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.3696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(537.4669, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(351.1890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(561.7075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.3887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.2470, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.8923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.4274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.0535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.7559, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.6775, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.7282, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(397.9293, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.4910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.5995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(416.9774, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(694.4441, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(386.2048, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.4301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(387.8197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(681.5121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.8211, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.6503, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(627.4958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.7016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(476.1667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(553.9458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(285.8272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.3273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.1026, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.8311, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.5769, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.5023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.5771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(296.0257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.6232, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(250.5232, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.5216, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(530.1713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.1864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.5733, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(226.7314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.4863, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.8217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(269.2089, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.22326454033771104
Sentence level Krippendorff's alpha for Premises:  0.33958724202626644
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 373
	Prediction sentences having premises: 75
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 42
	Prediction Sentence having neither claim nor premise: 127
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 326
	Sentences having claim in only one of reference or prediction: 207
	Sentences having premise in both reference and prediction: 357
	Sentences having premise in only one of reference or prediction: 176
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(499.2173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.9211, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(790.9377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1448.7068, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(470.4672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(494.8858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(608.7986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.8299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(620.3412, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.5768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(470.8488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(783.9398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.3580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(512.0840, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(294.4601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(353.3824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.5516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.3629, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.8687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.7573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.4985, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.9186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.4577, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.4575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.1755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.6860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.6315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.3131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(336.0028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.2696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.6426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.7733, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(538.1677, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.8658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.8166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(291.2792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(296.5544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(769.4897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(914.4401, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(488.9357, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.0668, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.2423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.2346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(342.7547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.3056, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(377.4636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(412.0543, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(324.6473, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(345.2515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.3022, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(563.3425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.0939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.7390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.5601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.9819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.4651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.4713, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4371482176360225
Sentence level Krippendorff's alpha for Premises:  0.49718574108818014
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 266
	Prediction sentences having premises: 247
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 75
	Prediction Sentence having neither claim nor premise: 95
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 383
	Sentences having claim in only one of reference or prediction: 150
	Sentences having premise in both reference and prediction: 399
	Sentences having premise in only one of reference or prediction: 134
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(85.3768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.1537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.7768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(683.6647, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.0937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.5651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.8478, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.1071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.3168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.9569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.4713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.3840, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.4440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(493.8333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.1848, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(329.3072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.0602, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(276.2332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.2014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.6333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.4537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.1300, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.3450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.5870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.5540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.4615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.1614, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.0136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.6729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.3296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.6299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.2457, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.9525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.7990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.0423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.6157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.7603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.9661, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(277.0888, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.2166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.1658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.1706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.1085, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.2360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.5800, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.8549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(254.9053, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.7990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.9516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.4746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.8009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.1024, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.0455, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.0497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.0057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.0215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.2170, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4934333958724203
Sentence level Krippendorff's alpha for Premises:  0.5159474671669794
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 217
	Prediction sentences having premises: 252
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 66
	Prediction Sentence having neither claim nor premise: 130
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 398
	Sentences having claim in only one of reference or prediction: 135
	Sentences having premise in both reference and prediction: 404
	Sentences having premise in only one of reference or prediction: 129
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(93.2422, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.1195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.8601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(632.9650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.6294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.0137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.1731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.4892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.8881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.5914, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.1821, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.5348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.1173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(415.9240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.0336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.4308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.0107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.2709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.4974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.5172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.7834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.1299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.5893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.2220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.4612, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.2959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.8519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.7979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.3092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.1867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.4870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.6446, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.9679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.3936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.9674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.7768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.1158, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(158.2849, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.0137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.9536, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.1582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.9143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.5281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.9508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.2442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.3405, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.1098, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.7025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.1401, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.9938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.3080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.8684, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.6920, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.2047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.5451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.1283, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.3789, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4521575984990619
Sentence level Krippendorff's alpha for Premises:  0.5121951219512195
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 240
	Prediction sentences having premises: 253
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 73
	Prediction Sentence having neither claim nor premise: 113
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 387
	Sentences having claim in only one of reference or prediction: 146
	Sentences having premise in both reference and prediction: 403
	Sentences having premise in only one of reference or prediction: 130
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(54.8939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.9634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.7080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(574.7995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.9266, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.8529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.8598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.2309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.3892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.8360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.4878, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.9598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.2679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(399.0008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.4669, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(207.7181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.5682, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.3851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.0851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.3461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.5014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.6787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.3208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.6210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.0375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.7299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.7502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.6672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.4050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.5226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.7002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.1845, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.7490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.3896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.7783, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.9015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.8091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.0258, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.5962, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.9888, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.0120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.4242, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.2258, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.3280, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.4241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.4298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.6330, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.6780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.7460, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.2414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(401.1083, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.0871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.9103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.7308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.8482, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.7464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.5452, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.46341463414634143
Sentence level Krippendorff's alpha for Premises:  0.5347091932457786
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 241
	Prediction sentences having premises: 259
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 76
	Prediction Sentence having neither claim nor premise: 109
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 390
	Sentences having claim in only one of reference or prediction: 143
	Sentences having premise in both reference and prediction: 409
	Sentences having premise in only one of reference or prediction: 124
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(47.4014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.5498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.5734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(537.4400, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.0439, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.6033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.8008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.1265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.2567, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.7342, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.6791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.5417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.0166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(374.5149, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.8398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.9359, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.5331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.5911, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.3738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.7823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.8278, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.2763, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.0842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.6195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.7349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(6.4168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.7837, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.0316, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.9535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.0843, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.2467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.9982, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.5892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.2968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.4017, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.7236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.9509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.8724, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.8482, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.8956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.7897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(6.2927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.7843, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.6676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.2315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.8215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.3892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.0186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.1377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.9198, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(355.9684, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.5450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.4310, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.2225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.4485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.0305, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.7522, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.44840525328330205
Sentence level Krippendorff's alpha for Premises:  0.5384615384615384
Additional attributes: 
	Total Sentences: 533
	Prediction sentences having claims: 247
	Prediction sentences having premises: 252
	Reference sentences having claims: 192
	Reference sentences having premises: 229


	Prediction Sentence having both claim and premise: 75
	Prediction Sentence having neither claim nor premise: 109
	Reference Sentence having both claim and premise: 40
	Reference Sentence having neither claim nor premise: 152


	Sentences having claim in both reference and prediction: 386
	Sentences having claim in only one of reference or prediction: 147
	Sentences having premise in both reference and prediction: 410
	Sentences having premise in only one of reference or prediction: 123
				 Metric computations: None


		-------------RUN 4-----------
			------------EPOCH 1---------------
Loss:  tensor(1738.3777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2617.9043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1997.7887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1529.2776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2358.1228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3546.1626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3787.6396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2605.1001, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2641.2219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1460.6573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2624.8423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1807.8678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2770.9712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1138.0518, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(960.4437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2181.7197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2566.5942, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1177.4960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(866.1243, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(884.2319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1651.9371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1679.0822, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1888.4261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1306.0575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2635.9229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2384.4585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1116.5636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1666.4288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(4505.6328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2706.8853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1494.2426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2822.4016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2953.1533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1461.0135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1874.8589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1185.3158, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2387.6504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2630.6816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1238.5380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2992.6489, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2291.8706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2448.1599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1830.5995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1548.9346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1429.9742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2029.0247, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1408.3098, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1855.1345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2210.3652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1176.2592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1128.6826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2020.0974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1392.8062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2754.4707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1566.7294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1711.2554, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  -0.008849557522123908
Sentence level Krippendorff's alpha for Premises:  0.32389380530973455
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 262
	Prediction sentences having premises: 28
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 2
	Prediction Sentence having neither claim nor premise: 277
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences having claim in both reference and prediction: 280
	Sentences having claim in only one of reference or prediction: 285
	Sentences having premise in both reference and prediction: 374
	Sentences having premise in only one of reference or prediction: 191
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(1273.2694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1882.5781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1615.9442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1092.6804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1809.1445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2860.9053, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2816.4568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2370.3645, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2164.5095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1174.8308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2146.9412, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1586.5223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2490.0610, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(911.6039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(752.5058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1723.0154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2242.4019, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(976.4476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(723.6530, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(747.8138, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1205.1089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1300.6165, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1598.9781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1082.5913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2158.0422, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2061.0986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(903.5486, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1436.1372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(4004.3301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2253.3662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1222.6407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2473.8787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2507.7832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1280.4926, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1586.8569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1008.3020, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2032.8730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2306.0342, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1035.4745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2471.6909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1936.0212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2098.1760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1489.9854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1297.5442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1207.9919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1669.8645, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1195.1039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1477.6638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1815.2134, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1007.4509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(959.1277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1694.0668, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1142.7244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2407.8606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1193.4545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1320.4092, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.16106194690265485
Sentence level Krippendorff's alpha for Premises:  0.4336283185840708
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 228
	Prediction sentences having premises: 227
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 62
	Prediction Sentence having neither claim nor premise: 172
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences having claim in both reference and prediction: 328
	Sentences having claim in only one of reference or prediction: 237
	Sentences having premise in both reference and prediction: 405
	Sentences having premise in only one of reference or prediction: 160
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(1066.5681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1659.5358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1435.3008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(947.4368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1561.3798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2603.2642, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2467.7959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2144.5264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1775.8722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(956.5775, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1880.2874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1397.7952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2180.0996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(804.0839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(609.1605, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1432.5161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1932.9065, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(805.9385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(646.4891, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(644.7321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(993.7198, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1085.1660, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1441.4695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(983.7070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1871.9386, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1710.3723, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(761.4655, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1193.9814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3470.3115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1949.0372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1051.4458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2159.9346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2150.0620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1124.9904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1378.4514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(900.0017, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1594.8154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1847.0964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(895.1875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1881.8188, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1622.9424, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1708.3962, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1187.1978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1084.2253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1055.5916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1291.2135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(959.9172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1021.2081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1301.8271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(822.1599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(832.8269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1466.5231, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(865.5959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1967.7950, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(846.1776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1060.5857, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.21415929203539819
Sentence level Krippendorff's alpha for Premises:  0.46548672566371685
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 219
	Prediction sentences having premises: 230
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 41
	Prediction Sentence having neither claim nor premise: 157
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences where reference and prediction agree on claim presence: 343
	Sentences having claim in only one of reference or prediction: 222
	Sentences where reference and prediction agree on premise presence: 414
	Sentences having premise in only one of reference or prediction: 151
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(864.0551, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1412.2886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1173.8892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(807.8124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1313.7356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2209.5854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2118.6797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1599.2169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1433.8325, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(774.1527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1601.6289, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1136.2306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1749.4001, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(655.6007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(483.8237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1201.2488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1582.5212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(594.2363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.4983, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(500.6250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(748.6599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(804.9971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1211.9937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(853.9012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1543.6992, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1385.3672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(677.2756, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1029.7472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2888.7197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1573.3384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(828.3468, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1785.0862, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1908.8279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(928.6006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1161.3845, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(764.7260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1284.6416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1627.9021, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(744.4162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1530.7839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1355.9812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1385.9994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(987.4161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(936.2306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(879.0287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1008.3999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(743.6894, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(646.2424, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(877.8116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(650.3786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(641.6501, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1158.7739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(598.9857, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1615.6125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(653.7595, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(916.7109, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.28495575221238933
Sentence level Krippendorff's alpha for Premises:  0.4690265486725663
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 213
	Prediction sentences having premises: 205
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 43
	Prediction Sentence having neither claim nor premise: 190
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences where reference and prediction agree on claim presence: 363
	Sentences having claim in only one of reference or prediction: 202
	Sentences where reference and prediction agree on premise presence: 415
	Sentences having premise in only one of reference or prediction: 150
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(690.6007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1155.7159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(993.2970, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(701.5139, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1066.8235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1965.2448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1862.3706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1472.2960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1243.4612, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(664.5322, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1437.9727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(963.9606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1506.8567, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(548.1903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(434.7194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1055.3356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1535.6261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(512.4153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(353.3741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(419.9486, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(595.5035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(606.7494, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1043.1843, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(728.2183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1360.8105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1032.5563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(541.0825, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(848.8250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2424.1096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1208.8599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(623.8860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1441.2676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1844.6102, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(956.0169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1105.1099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(830.1196, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1319.4443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1527.9546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(868.1674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1509.3267, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1091.2385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1140.7664, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(793.7575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(735.5238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(676.4722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(741.5236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(595.1189, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(431.6713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(761.1708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(594.7080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(632.9708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1186.9607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(508.7735, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1579.9766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(612.0834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(981.1436, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3132743362831858
Sentence level Krippendorff's alpha for Premises:  0.4123893805309734
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 167
	Prediction sentences having premises: 335
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 60
	Prediction Sentence having neither claim nor premise: 123
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences where reference and prediction agree on claim presence: 371
	Sentences having claim in only one of reference or prediction: 194
	Sentences where reference and prediction agree on premise presence: 399
	Sentences having premise in only one of reference or prediction: 166
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(625.9107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1275.3481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1195.0725, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(666.9716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1411.8678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2194.3220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1809.5813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1443.7797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1095.2632, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(588.3059, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1240.3551, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(798.0960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1359.2136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(480.6589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.1298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(914.8478, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1291.1846, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(407.7750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(347.3770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(348.4567, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(457.4571, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.1487, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1016.9868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(690.6476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1258.8123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1155.7112, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(508.6958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(771.7415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2273.4663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1111.0145, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(575.4347, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1341.8755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1529.4329, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(681.9729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(865.5148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(620.8310, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(959.9584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1118.8473, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(587.1191, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1036.4574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(968.0953, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(972.9667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(619.7202, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(603.0448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(531.6814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(643.8127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(415.3416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(291.8482, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(453.7764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(385.5756, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(425.5228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(890.3412, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.2068, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1107.9670, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(463.6357, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(709.2073, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.40884955752212393
Sentence level Krippendorff's alpha for Premises:  0.40530973451327434
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 148
	Prediction sentences having premises: 343
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 46
	Prediction Sentence having neither claim nor premise: 120
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences where reference and prediction agree on claim presence: 398
	Sentences having claim in only one of reference or prediction: 167
	Sentences where reference and prediction agree on premise presence: 397
	Sentences having premise in only one of reference or prediction: 168
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(465.2239, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(901.0291, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(803.0393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(572.4105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(851.2700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1677.6167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1444.3173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1220.9219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(952.6925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(516.9636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1156.3799, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(735.2526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(981.5072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(335.8757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.7493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(870.2944, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(957.0289, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.3795, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.1232, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.5154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(285.7728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.4651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(675.5620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.6240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(947.6764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(737.9575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.0090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(520.3606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1627.3645, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(785.2261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(405.6079, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1001.5542, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1288.4050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.4377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(635.3785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(557.7523, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(714.9456, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(909.9232, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(491.5413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(731.3834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(723.4808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(867.4912, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.8479, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(449.0363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(454.0422, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(579.6000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(250.2819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.1738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(312.9871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.4560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(291.1166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(738.7871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.8696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(816.9590, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(329.9285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(579.5688, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.327433628318584
Sentence level Krippendorff's alpha for Premises:  0.4194690265486726
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 157
	Prediction sentences having premises: 343
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 56
	Prediction Sentence having neither claim nor premise: 121
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences where reference and prediction agree on claim presence: 375
	Sentences having claim in only one of reference or prediction: 190
	Sentences where reference and prediction agree on premise presence: 401
	Sentences having premise in only one of reference or prediction: 164
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(357.7294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(653.6636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(567.8506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(451.1367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(650.5173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1249.8672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1257.5143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(789.6387, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(820.1013, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(488.0391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1147.5396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(582.3622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(795.5344, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.9311, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.4419, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(769.2476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(884.8436, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.4750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.3133, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.5779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(296.6370, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.8008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(625.4396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(490.3638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(943.2808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(574.4504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(323.5507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(552.2798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1529.9164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(667.1118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(275.5042, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(678.4762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1055.4521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(415.3035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.9393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.9428, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(675.1626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(810.4052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(501.8698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(628.2393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(807.6075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1177.1208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(553.8431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(497.0850, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(673.1650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(848.8847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(202.9408, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(268.8354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(295.8837, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(328.1945, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(385.2476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1184.8218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.5445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(945.0770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(430.1998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(816.5542, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3734513274336283
Sentence level Krippendorff's alpha for Premises:  0.4938053097345133
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 188
	Prediction sentences having premises: 278
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 44
	Prediction Sentence having neither claim nor premise: 143
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences where reference and prediction agree on claim presence: 388
	Sentences having claim in only one of reference or prediction: 177
	Sentences where reference and prediction agree on premise presence: 422
	Sentences having premise in only one of reference or prediction: 143
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(238.4866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(491.8286, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(485.9209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(298.2305, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(477.2980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(895.1948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(947.2799, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(634.7376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(680.7239, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.9402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(883.8361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(395.5491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(718.0063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.4213, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.6664, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(660.1104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(918.6064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.3534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.8085, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.3662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(432.4680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(339.3686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(852.4219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(680.3859, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1350.6292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(721.6566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(548.1006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(605.1006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2526.2500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(925.5714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.5454, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1208.4485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1087.9019, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.0390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(568.1699, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(520.1801, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(639.6254, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(883.0370, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(277.0836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(415.4618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(439.1235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(566.1057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(277.5431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.3708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(373.8523, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(526.8919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.2336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.5117, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(312.4114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.2564, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(487.8373, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1134.1357, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(457.8676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1942.4058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1299.3861, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1186.3821, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3345132743362832
Sentence level Krippendorff's alpha for Premises:  0.32389380530973455
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 269
	Prediction sentences having premises: 56
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 16
	Prediction Sentence having neither claim nor premise: 256
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences where reference and prediction agree on claim presence: 377
	Sentences having claim in only one of reference or prediction: 188
	Sentences where reference and prediction agree on premise presence: 374
	Sentences having premise in only one of reference or prediction: 191
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(743.1837, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1467.9750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1069.0874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(820.9337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1009.3464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2096.7422, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2246.9014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1919.1888, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1205.6584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(639.6082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1260.9218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(804.5983, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1068.3425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.0323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.4826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(814.5092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(814.1536, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.4413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.1203, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.6389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(313.6668, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(266.2786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(551.5551, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(390.8319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(922.1546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(491.0794, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(299.7137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(438.3397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1872.9028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(785.0714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(402.7911, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(872.5132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1231.0553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(511.9545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(896.6373, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(591.7397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(896.5722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1345.5455, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(533.8177, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1504.7719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1256.2888, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1053.9644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(442.9016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(587.8897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(432.4112, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(570.9515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(439.3354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(320.8768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(443.6986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(336.1763, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(335.4639, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(764.7416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.8672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(773.0364, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(481.1802, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(630.8284, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.24247787610619465
Sentence level Krippendorff's alpha for Premises:  0.3946902654867257
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 349
	Prediction sentences having premises: 156
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 53
	Prediction Sentence having neither claim nor premise: 113
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences having claim in both reference and prediction: 351
	Sentences having claim in only one of reference or prediction: 214
	Sentences having premise in both reference and prediction: 394
	Sentences having premise in only one of reference or prediction: 171
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(318.6121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(686.4152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(517.1088, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(408.9091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1013.3151, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1424.5269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1346.4899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(905.8062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(747.1012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(401.5027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(911.8029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(475.8308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(726.0155, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.3314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.6384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(683.3546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(809.7496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.4037, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.3864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.2215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(257.7902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.8535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(643.3790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(455.9579, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(917.6943, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(499.2446, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(295.4116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(390.7986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1237.4226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(525.9130, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.5056, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(635.5032, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(876.7719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(269.2747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(459.2490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(365.8781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(461.9561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(648.3139, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(294.3610, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(366.1910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(401.7498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(493.2582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(297.1720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(316.4285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.9884, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(368.9439, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.7181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.0116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.4669, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.3900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.7890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(650.5408, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(166.6127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(541.8831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.7966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(514.3405, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.2778761061946903
Sentence level Krippendorff's alpha for Premises:  0.4796460176991151
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 301
	Prediction sentences having premises: 202
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 58
	Prediction Sentence having neither claim nor premise: 120
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences having claim in both reference and prediction: 361
	Sentences having claim in only one of reference or prediction: 204
	Sentences having premise in both reference and prediction: 418
	Sentences having premise in only one of reference or prediction: 147
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(202.9387, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(550.4637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(477.4280, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.7696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(470.9904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(928.7401, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(958.3376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(644.5388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(639.5657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(376.1972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(657.1279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.5983, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(418.9253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.4394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.9632, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(549.1644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(582.0142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.3026, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.9186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.0346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.5181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.1832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.1089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.0853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(616.8574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.6457, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.1388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.2582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1072.2039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(443.8962, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.3986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(416.7443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(835.3211, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.1425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.7654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(336.9866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(411.9728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(565.6431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.5514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.8805, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.8008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.1337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.5503, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.1214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.1031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.3526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.9589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.9016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.4714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.6793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.6706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.8622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.4682, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(340.6801, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.0167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(294.2568, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3097345132743363
Sentence level Krippendorff's alpha for Premises:  0.504424778761062
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 280
	Prediction sentences having premises: 209
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 52
	Prediction Sentence having neither claim nor premise: 128
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences having claim in both reference and prediction: 370
	Sentences having claim in only one of reference or prediction: 195
	Sentences having premise in both reference and prediction: 425
	Sentences having premise in only one of reference or prediction: 140
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(140.0373, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(280.4128, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.1050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.9204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(429.7159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(680.4148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(648.6454, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.3649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(474.0835, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(280.0306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(484.1238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.4213, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(340.6014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.8474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.9734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(534.9136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(586.9422, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.8726, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.8529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.5504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.1579, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.6474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.2303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.5646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(743.3807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.0654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.8667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.6852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(913.8431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(361.7628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.5375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(313.6559, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(658.3179, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.3098, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.2404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(210.8448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(307.6193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.3080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.0422, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.2389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.4194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(312.3129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.5223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.4245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.3428, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.6541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.9314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.7867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.5291, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.4038, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.9012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(413.3324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.7558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.0954, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.7229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(298.9739, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3911504424778761
Sentence level Krippendorff's alpha for Premises:  0.5398230088495575
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 151
	Prediction sentences having premises: 289
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 39
	Prediction Sentence having neither claim nor premise: 164
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences having claim in both reference and prediction: 393
	Sentences having claim in only one of reference or prediction: 172
	Sentences having premise in both reference and prediction: 435
	Sentences having premise in only one of reference or prediction: 130
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(154.0539, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(322.7187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(346.3477, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.3587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.3426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(650.1508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(565.1976, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(525.1010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(442.8287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(249.4111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(503.5649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(254.6192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(299.9738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.6024, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.2803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(395.3514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(405.4391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.7987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.1490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.4341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.7489, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.4786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.1209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.9750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(415.8911, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.2100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.6880, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.3164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(840.7500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.1941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.8919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.4485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(638.9317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.9125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.2539, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.2285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(369.9487, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(419.9755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.7065, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.9514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.8570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.1896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.9415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.6062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.0130, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(184.9250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.6491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.9571, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.9474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.0912, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.7865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(298.8631, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.1791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.4784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.8789, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.1971, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3380530973451328
Sentence level Krippendorff's alpha for Premises:  0.5858407079646017
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 258
	Prediction sentences having premises: 242
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 69
	Prediction Sentence having neither claim nor premise: 134
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences having claim in both reference and prediction: 378
	Sentences having claim in only one of reference or prediction: 187
	Sentences having premise in both reference and prediction: 448
	Sentences having premise in only one of reference or prediction: 117
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(73.8430, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.9832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.6121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.3406, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.5188, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(443.8797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(395.8403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(299.2407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.7365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.3704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(330.6591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.3753, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(244.8138, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.3127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.4730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(342.2516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(411.8224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.6019, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.3601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.6012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.6958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.7046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.6100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.4700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(415.0918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.8481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.9679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.7805, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(861.8846, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(325.1181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.7035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.7481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(766.0499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.2000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.1267, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.4985, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(401.3105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(266.0237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.5435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.8723, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.0556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.2140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.3290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.8676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.4073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.1535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.6373, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.7814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.4265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.7883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.0223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.2307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.4569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.4619, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.2261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.3688, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.34513274336283184
Sentence level Krippendorff's alpha for Premises:  0.5504424778761061
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 272
	Prediction sentences having premises: 218
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 61
	Prediction Sentence having neither claim nor premise: 136
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences having claim in both reference and prediction: 380
	Sentences having claim in only one of reference or prediction: 185
	Sentences having premise in both reference and prediction: 438
	Sentences having premise in only one of reference or prediction: 127
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(62.8495, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.4601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.8667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.8476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.1300, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(487.3145, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(481.2657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.6981, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(338.0995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.7229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(281.0386, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.9589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.4700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.7597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.2908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(345.6938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(438.5609, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.2291, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.1010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.4338, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.1511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.7566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.8584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.5684, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(337.5813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.7759, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.4259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(207.4874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.6991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.4191, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.5263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.0496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(598.4973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.4792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.4573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.9921, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(336.4298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.1434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.1028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.3253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.9881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.8974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.2827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.3022, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.4568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.1875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.0839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.8165, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.3012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.0816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.3339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.6108, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.7651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.6412, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.7170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.5845, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.38761061946902653
Sentence level Krippendorff's alpha for Premises:  0.5716814159292035
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 200
	Prediction sentences having premises: 266
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 47
	Prediction Sentence having neither claim nor premise: 146
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences having claim in both reference and prediction: 392
	Sentences having claim in only one of reference or prediction: 173
	Sentences having premise in both reference and prediction: 444
	Sentences having premise in only one of reference or prediction: 121
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(54.7948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.5022, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.3648, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.4276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.8624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.2361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(327.4931, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(281.1052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(438.2363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.2635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(314.4625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.7190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(237.4881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.4555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.1158, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.6274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.8362, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.7851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.1394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.0770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.0710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.1096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.8816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.1744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.3170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.3784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.5263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.4707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(658.1785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.9636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.0776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.5197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(494.4312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.6502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.5260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.4381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.9751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.2024, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.6309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.7922, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.0091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.0388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.8088, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.2759, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.7029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.0669, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.9968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.6253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.5113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.6553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.4386, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(174.0660, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.3892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.3179, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.4328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.7905, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.352212389380531
Sentence level Krippendorff's alpha for Premises:  0.5610619469026549
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 256
	Prediction sentences having premises: 239
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 68
	Prediction Sentence having neither claim nor premise: 138
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences having claim in both reference and prediction: 382
	Sentences having claim in only one of reference or prediction: 183
	Sentences having premise in both reference and prediction: 441
	Sentences having premise in only one of reference or prediction: 124
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(45.7874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.9399, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.2258, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.2885, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.0815, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(351.6494, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.1404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.5679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(319.5585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.4831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.5525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.6952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.9875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.4049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.0490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.1766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.5974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.0658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.9678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.8252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.8689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.0049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.5541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.8893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.7137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.9711, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.6078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.4172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(617.9889, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(184.6655, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.7728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.6283, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.8124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.6369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.0641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.7050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.0755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.2415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.0757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.6334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.2656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.2820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.7411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.9784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.1836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.3744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.7759, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.9056, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.2028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.0938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.3358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.8248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.4701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.6924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.0833, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.7473, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.38053097345132747
Sentence level Krippendorff's alpha for Premises:  0.5716814159292035
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 210
	Prediction sentences having premises: 268
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 56
	Prediction Sentence having neither claim nor premise: 143
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences having claim in both reference and prediction: 390
	Sentences having claim in only one of reference or prediction: 175
	Sentences having premise in both reference and prediction: 444
	Sentences having premise in only one of reference or prediction: 121
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(38.9930, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.2663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.2227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.0617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.5655, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(324.3284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.3103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.4772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.1897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.7822, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.8611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.9892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.8657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.4906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.8828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.9346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.1954, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.7156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.9772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.6786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.5592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.2467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.6111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.4025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.0184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.7059, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.3745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.7107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(588.6616, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(166.0891, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.6351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.8084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(428.7466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.0370, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.8958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.5046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.9120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.4067, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.6979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.7150, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.8964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.6549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.8090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.7881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.8306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.2005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.7210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.6493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.3881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.3298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.7038, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.7095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.2206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.0499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.4928, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.2850, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3557522123893805
Sentence level Krippendorff's alpha for Premises:  0.5752212389380531
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 233
	Prediction sentences having premises: 253
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 61
	Prediction Sentence having neither claim nor premise: 140
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences having claim in both reference and prediction: 383
	Sentences having claim in only one of reference or prediction: 182
	Sentences having premise in both reference and prediction: 445
	Sentences having premise in only one of reference or prediction: 120
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(35.2949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.0744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.5795, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.1915, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.1608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.2941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.2688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.3917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.9296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.3506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.8101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.3677, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.0520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(6.6388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.2413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.5986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.6703, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.6994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.4438, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(6.2899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.8815, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.6544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.4799, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.3802, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.2169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.6261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.3236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.7681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(556.7025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.3808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.3306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.8856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(422.6837, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.3359, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.2298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.6634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.3012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.3795, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.7270, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.8516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.0880, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.2318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.8027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.4250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.0222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.1508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.8937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.1571, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.6336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.3023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.3858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.0150, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.2279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.2135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.0198, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.5303, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.352212389380531
Sentence level Krippendorff's alpha for Premises:  0.5681415929203539
Additional attributes: 
	Total Sentences: 565
	Prediction sentences having claims: 208
	Prediction sentences having premises: 271
	Reference sentences having claims: 227
	Reference sentences having premises: 207


	Prediction Sentence having both claim and premise: 55
	Prediction Sentence having neither claim nor premise: 141
	Reference Sentence having both claim and premise: 41
	Reference Sentence having neither claim nor premise: 172


	Sentences having claim in both reference and prediction: 382
	Sentences having claim in only one of reference or prediction: 183
	Sentences having premise in both reference and prediction: 443
	Sentences having premise in only one of reference or prediction: 122
				 Metric computations: None


		-------------RUN 5-----------
			------------EPOCH 1---------------
Loss:  tensor(2880.5430, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1835.0132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3389.4946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2905.7896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1876.6409, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2268.4592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2200.0215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2182.3259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2240.2383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1956.1478, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1870.3785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1607.8969, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2690.5811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2811.6304, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3139.8604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1172.8840, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1257.1611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(956.2604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1688.9907, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1186.5679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1843.9625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2537.5361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1764.4052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1277.5634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1826.0483, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2179.2339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1448.9597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2805.9541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2760.1685, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1709.1145, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1848.9470, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1271.1458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2083.8403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2036.5819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1058.4087, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1885.8561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1278.9009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2037.8719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2070.5437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(934.4253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(568.1283, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1315.1166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2551.0825, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2142.3804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1323.4536, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1593.3463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1395.6975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2240.3804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2680.4844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2961.2720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1761.9757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1988.5811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(907.7935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(949.6958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1337.5247, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1641.8999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1667.0869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1912.7748, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.16157205240174677
Sentence level Krippendorff's alpha for Premises:  0.2954876273653566
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 311
	Prediction sentences having premises: 260
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 72
	Prediction Sentence having neither claim nor premise: 188
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 399
	Sentences having claim in only one of reference or prediction: 288
	Sentences having premise in both reference and prediction: 445
	Sentences having premise in only one of reference or prediction: 242
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(2036.3690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1238.9529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2648.9722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2236.1362, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1463.7881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1779.7206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1678.8140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1606.4844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1631.4573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1468.9382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1446.8723, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1220.7144, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2273.2856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2140.7507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2531.8884, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(857.0521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(940.5930, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(767.7118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1426.4695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1009.0182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1536.5898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2311.0039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1590.5350, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1084.7419, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1665.9640, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1948.7742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1145.9341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2288.3088, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2325.6514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1344.9751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1506.3002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1104.9517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1969.7284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1773.7988, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(892.0455, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1606.5953, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1096.5989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1848.4258, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1994.6345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(870.2750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.8354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1102.2837, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2266.9395, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1769.6185, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1161.9092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1380.3071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1250.7395, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2083.6428, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2057.3875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2472.1440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1420.1073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1915.1196, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(759.8024, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(835.0835, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1108.2672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1252.0308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1356.3491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1517.0675, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.23435225618631728
Sentence level Krippendorff's alpha for Premises:  0.26928675400291124
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 356
	Prediction sentences having premises: 321
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 134
	Prediction Sentence having neither claim nor premise: 144
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 424
	Sentences having claim in only one of reference or prediction: 263
	Sentences having premise in both reference and prediction: 436
	Sentences having premise in only one of reference or prediction: 251
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(1634.7651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(978.4296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2479.6724, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1777.9834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1198.1509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1440.7478, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1437.7200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1289.9214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1301.5492, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1240.1459, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1278.2677, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1040.2101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2064.5225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1820.4562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2267.6936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(711.9511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(795.1403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(665.3346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1230.7023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(899.9288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1224.4945, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2041.4734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1204.6045, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(838.3836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1222.4340, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1520.8381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(876.1887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1803.6504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1906.8992, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1100.5449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1225.5596, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(930.0607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1678.0454, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1577.9658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(717.6843, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1388.8572, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1011.7717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1402.2490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1753.0166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(772.5186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.4628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(867.5130, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1998.8511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1465.1482, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(857.0582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1121.9343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1075.6093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1576.4042, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1697.4023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1884.7791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1146.0437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1676.1012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(621.7140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(639.1092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(902.7814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(961.3485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1101.5137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1285.3623, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.388646288209607
Sentence level Krippendorff's alpha for Premises:  0.34788937409024745
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 297
	Prediction sentences having premises: 268
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 90
	Prediction Sentence having neither claim nor premise: 212
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 477
	Sentences having claim in only one of reference or prediction: 210
	Sentences having premise in both reference and prediction: 463
	Sentences having premise in only one of reference or prediction: 224
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(1318.4949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(719.7689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2014.5426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1511.5171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(991.2615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1104.5306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1148.3218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1068.6526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1024.8912, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(976.3990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1044.8425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(820.5797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1755.7456, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1491.2288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1951.1528, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(541.2586, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(591.9629, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(538.2132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(804.5864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(721.8437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(965.5566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1761.6298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(787.1801, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(607.5989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(788.7227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1124.0021, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(724.3005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1435.9628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1547.9072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(854.8861, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(991.4821, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(768.8601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1446.9651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1335.0779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(563.7281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1134.3120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(927.8362, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1256.8279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1424.1406, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(636.7305, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.3436, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(699.9662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1616.8206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1113.1257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(577.6931, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(846.1483, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.8384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1063.1091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1321.8563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1387.4941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(948.4515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1308.6208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(472.3876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.0015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(737.7183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(676.9135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(858.6254, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1012.1110, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.35371179039301315
Sentence level Krippendorff's alpha for Premises:  0.3682678311499272
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 301
	Prediction sentences having premises: 275
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 76
	Prediction Sentence having neither claim nor premise: 187
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 465
	Sentences having claim in only one of reference or prediction: 222
	Sentences having premise in both reference and prediction: 470
	Sentences having premise in only one of reference or prediction: 217
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(876.2729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(438.2049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1581.3518, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1158.7816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(770.8749, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(913.8353, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(978.4642, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(881.1465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(783.5355, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(789.1168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(983.2797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(673.2292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1498.6082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1279.6355, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1698.9359, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(429.7650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(412.2883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(400.5673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(607.1206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.1902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(710.0270, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1387.7126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(461.2270, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(439.4765, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(459.1715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(788.0069, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(608.8201, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1109.6877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1423.0735, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(807.6338, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(886.5447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(691.7002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1320.8274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1176.9327, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.7651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(949.2379, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(771.0349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1075.9403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1299.8901, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(477.7876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.0815, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(562.9961, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1181.8110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(811.2175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(385.1233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(666.8879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(713.0994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(728.4189, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1179.8951, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1263.1754, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(783.8850, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1161.5582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.7441, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(392.3464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(677.2166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(506.1148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(704.6923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(768.0521, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3857350800582242
Sentence level Krippendorff's alpha for Premises:  0.3740902474526928
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 318
	Prediction sentences having premises: 307
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 82
	Prediction Sentence having neither claim nor premise: 144
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 476
	Sentences having claim in only one of reference or prediction: 211
	Sentences having premise in both reference and prediction: 472
	Sentences having premise in only one of reference or prediction: 215
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(665.2256, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(314.5369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1235.0934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(938.5156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(641.3867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(797.2039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(804.3882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(732.8694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(659.8947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(608.4894, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(752.0140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(602.0453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1234.2986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1034.9755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1398.1270, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.1113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(298.8402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(328.1351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.8267, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(324.7061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(553.8318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1123.3547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.8358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.7430, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.2842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(748.4871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(491.6870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1028.1804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1133.6832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(546.4757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.0302, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(505.4758, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(907.5404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(962.3066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.6232, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(637.2998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(613.1437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(828.8794, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(989.8893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(341.9129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.2515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.4574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(992.0480, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(605.6506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(306.6027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(534.6115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(638.2394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(937.3500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1100.5508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(946.3832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(687.0386, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(969.6226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(237.3788, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(250.2547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(453.8192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.6987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(561.2911, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(536.5048, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.35953420669577874
Sentence level Krippendorff's alpha for Premises:  0.3857350800582242
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 337
	Prediction sentences having premises: 213
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 64
	Prediction Sentence having neither claim nor premise: 201
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 467
	Sentences having claim in only one of reference or prediction: 220
	Sentences having premise in both reference and prediction: 476
	Sentences having premise in only one of reference or prediction: 211
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(554.7410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.1517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1076.9739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(837.8528, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(597.9433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(683.6216, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(727.5005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(805.3439, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(485.1205, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(455.8740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.4553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(471.8921, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1021.8917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(804.8438, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1079.4320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.3176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(184.2272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.0393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.8607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.8819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(387.9104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(854.2834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.1656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(254.7730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.8346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(432.8864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(345.9902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(801.2443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1060.9409, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(502.5557, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(580.1847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(432.7834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(744.3567, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(844.9092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.0764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(525.1715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(497.9236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(676.1035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(772.8527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.3350, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.0779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(374.3397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(660.8512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(402.1846, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.9130, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.8257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(362.8959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(450.4393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(807.0654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(738.7443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(558.9474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(693.4948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(184.2047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(237.6060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(436.9291, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.5034, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(435.7713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(398.4143, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3740902474526928
Sentence level Krippendorff's alpha for Premises:  0.4177583697234353
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 368
	Prediction sentences having premises: 260
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 93
	Prediction Sentence having neither claim nor premise: 152
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 472
	Sentences having claim in only one of reference or prediction: 215
	Sentences having premise in both reference and prediction: 487
	Sentences having premise in only one of reference or prediction: 200
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(381.7822, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.6644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(742.0320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(662.1656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(430.0770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(547.7520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(558.9044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(462.0541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(404.2904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.9312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(580.2063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(418.3279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1463.4309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1136.6719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1388.9506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(254.5435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.9537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.7139, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(348.7814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.1058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(420.0510, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(790.4796, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.8792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(239.7758, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.9894, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(451.8304, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(322.6321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(802.9181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(823.9855, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(354.2242, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(466.5146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(367.5061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(694.4351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(825.7744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.9645, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(406.0083, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(411.2579, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(589.5555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(630.8220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.8403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.1866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.5601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(614.9816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(374.2469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.3965, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.7643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(517.4994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(585.4219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(710.3051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(587.3903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(469.0716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(703.9116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.7768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.2153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(406.2447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(333.2536, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(544.3193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.2166, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4061135371179039
Sentence level Krippendorff's alpha for Premises:  0.4032023289665211
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 133
	Prediction sentences having premises: 497
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 63
	Prediction Sentence having neither claim nor premise: 120
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 483
	Sentences having claim in only one of reference or prediction: 204
	Sentences having premise in both reference and prediction: 482
	Sentences having premise in only one of reference or prediction: 205
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(541.5853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.4008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1614.9529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1164.5431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.7867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(743.0491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(807.4219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(745.3030, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.2044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.0748, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(422.8749, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(295.3772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(813.7108, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(506.9544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(739.2346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.3611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.7747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.1044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.2157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.2854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(509.5836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(793.3223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.8027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.0421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(345.2717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(631.3118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(523.0205, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(978.7917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1787.3811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1253.1748, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1256.1484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(727.6450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(806.4485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(988.7321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(435.5267, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(803.9714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(569.7552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(723.3243, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(839.1481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(421.2519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.1710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(306.9124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(543.5247, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.1100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.6968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(291.0322, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.1288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.3785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(541.0564, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(512.1813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(381.3620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(508.7884, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.1313, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.8962, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(345.8330, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(280.2688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(466.6917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(381.5456, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.33624454148471616
Sentence level Krippendorff's alpha for Premises:  0.3245997088791849
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 87
	Prediction sentences having premises: 550
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 36
	Prediction Sentence having neither claim nor premise: 86
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 459
	Sentences having claim in only one of reference or prediction: 228
	Sentences having premise in both reference and prediction: 455
	Sentences having premise in only one of reference or prediction: 232
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(404.7607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.4639, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(923.6312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(754.1968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(530.0525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(774.5214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(785.2380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(820.6765, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(397.9377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(591.0679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(830.9443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(800.2576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1315.0465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1191.0898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1664.2758, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(361.6318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(174.4092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.1686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(502.2102, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(367.6891, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(448.8252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1136.5515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.9537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.7706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.4065, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(354.3606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(287.3014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(739.4686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(999.6979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(517.5787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(542.2244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(386.2110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(545.2560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(645.7275, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.5504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(535.2165, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.3883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(661.8075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(648.6171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.6871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.6969, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(327.0075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(772.1301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(438.3500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.4170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.9296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(368.6879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(355.9301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(821.8688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(781.8119, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(520.2566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(664.5779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.7054, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.6431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.9189, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(416.8986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(419.8301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(378.3494, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.44978165938864634
Sentence level Krippendorff's alpha for Premises:  0.42940320232896656
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 148
	Prediction sentences having premises: 434
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 63
	Prediction Sentence having neither claim nor premise: 168
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 498
	Sentences having claim in only one of reference or prediction: 189
	Sentences having premise in both reference and prediction: 491
	Sentences having premise in only one of reference or prediction: 196
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(351.7919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.8663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(754.6294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(495.4689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.5657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(406.1639, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(424.0374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(401.5291, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.2771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.6886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.6116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(240.0944, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(783.5997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(554.5516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(709.0354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.1827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.0181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.2628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(210.8519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.3062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.0414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(681.3096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.2532, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.9318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.0916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(249.9832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(210.1695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.8796, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(649.9308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.9371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(342.0469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.2573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(499.2328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(598.3331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.7147, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.0307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(330.0835, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(577.7159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(651.4995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.0066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.3489, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.0549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(634.1163, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(266.0392, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.0214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.5686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(254.6566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(262.5991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(345.8698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(390.0469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(307.6667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.9898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.4722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.8656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(244.4629, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.5177, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.9810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.9365, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.388646288209607
Sentence level Krippendorff's alpha for Premises:  0.42358078602620086
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 95
	Prediction sentences having premises: 410
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 32
	Prediction Sentence having neither claim nor premise: 214
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 477
	Sentences having claim in only one of reference or prediction: 210
	Sentences having premise in both reference and prediction: 489
	Sentences having premise in only one of reference or prediction: 198
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(188.2229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.9101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(547.6605, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(366.3331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.9457, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.6932, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(424.2924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(432.7173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.8859, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.2560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(393.3890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.3850, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(778.7205, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(605.1524, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(778.6675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.9987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.9418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.1015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.3970, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.7027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.4018, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.6623, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.9798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.1171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.1043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.6645, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.7633, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(428.4230, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(536.6513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.2074, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.5195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.6019, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(240.7767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(423.7228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.8016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.7724, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.8835, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(396.5885, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.4719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.1073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.5072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.6696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(306.2397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.0438, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.5913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.1277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.0280, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.3703, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(495.6341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(396.9276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(339.7184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(399.1516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.4436, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.7050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(285.6213, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.3854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.7250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(206.1013, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3740902474526928
Sentence level Krippendorff's alpha for Premises:  0.45269286754002913
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 230
	Prediction sentences having premises: 426
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 90
	Prediction Sentence having neither claim nor premise: 121
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 472
	Sentences having claim in only one of reference or prediction: 215
	Sentences having premise in both reference and prediction: 499
	Sentences having premise in only one of reference or prediction: 188
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(157.3714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.9355, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(447.3640, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.5720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.7038, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(304.8190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.0806, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(332.3691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.0058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.7922, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.2500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.1663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(630.3611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.8385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(567.9626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.0365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.8172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.2919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.0341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.5672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(355.4727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(508.2018, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.0991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(202.7986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.5761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.9518, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(405.5538, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(399.1959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(474.0522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(202.8609, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(381.2468, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.5332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.3217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(488.7560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.2976, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.8718, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.3243, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(374.3938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(349.8893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.3531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.6063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.0334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.2921, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.3091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.7816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.9634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.9182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.7559, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(286.5798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.3279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(254.2876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.2728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.4861, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.3441, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.9846, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(276.5657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(483.1063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(465.7702, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.24017467248908297
Sentence level Krippendorff's alpha for Premises:  0.3187772925764192
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 448
	Prediction sentences having premises: 196
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 83
	Prediction Sentence having neither claim nor premise: 126
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 426
	Sentences having claim in only one of reference or prediction: 261
	Sentences having premise in both reference and prediction: 453
	Sentences having premise in only one of reference or prediction: 234
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(341.9870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.0267, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(578.2336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(389.8407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.6887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(299.1639, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(334.8353, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.8951, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.1078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.1567, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(262.1892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.8775, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(579.6938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.1618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(445.7670, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.1975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.9569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.1436, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.0078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.6174, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.4696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.5704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.0079, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.1527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.7598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.8083, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.7082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.2467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(385.4823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.8059, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.9716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.8541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(366.4565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(510.8125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.1155, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.4062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(371.9919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(492.2633, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(569.4683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.8152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.8458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(427.4069, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(536.3450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(360.5332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.7379, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.0786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.4983, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.4183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(239.8300, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(313.0154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.4748, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(237.3459, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.8652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.1750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.9847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.1959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.5246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.0408, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3828238719068413
Sentence level Krippendorff's alpha for Premises:  0.44687045123726343
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 325
	Prediction sentences having premises: 322
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 104
	Prediction Sentence having neither claim nor premise: 144
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 475
	Sentences having claim in only one of reference or prediction: 212
	Sentences having premise in both reference and prediction: 497
	Sentences having premise in only one of reference or prediction: 190
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(109.9991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.1902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(353.0003, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(298.3949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.8267, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.4059, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.2731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.9102, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.3745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.5075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.0004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.3292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(628.8571, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(418.1674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(584.4742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.1356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.6975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.0035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.4688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.2522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.1045, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.7633, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.1927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.9369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.5355, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.1768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.3396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(297.7325, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(459.7481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.0237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.9505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.0467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.4753, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(347.5989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.9033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.9874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.0747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(356.4464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.0756, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.0736, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.3203, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.7515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.4816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.2477, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.9309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.5672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.9006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.2156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.1284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.0368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.9814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.5689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.0718, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.9538, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.7055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.8959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.4219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.7108, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3682678311499272
Sentence level Krippendorff's alpha for Premises:  0.44687045123726343
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 184
	Prediction sentences having premises: 410
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 65
	Prediction Sentence having neither claim nor premise: 158
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 470
	Sentences having claim in only one of reference or prediction: 217
	Sentences having premise in both reference and prediction: 497
	Sentences having premise in only one of reference or prediction: 190
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(92.0501, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.1118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(411.1190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.8255, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.6573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.4142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.6515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.3626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.3161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.1816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.6123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.7356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(526.5510, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.6797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(357.7986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.3719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.4524, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.7729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.2258, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.4823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.4132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.3740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.9049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.6984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.8491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.4696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.3775, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.4491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.5703, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.5928, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.4298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.1953, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.8937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.1993, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.3363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.9510, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.3624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.3418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.0824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.3636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.7811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.9336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.8207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.7315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.3569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.5561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.1730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.3983, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.3185, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.4213, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.6272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.7314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.3948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.2527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.2024, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.6400, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.3886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.9046, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4002911208151383
Sentence level Krippendorff's alpha for Premises:  0.470160116448326
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 237
	Prediction sentences having premises: 390
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 82
	Prediction Sentence having neither claim nor premise: 142
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 481
	Sentences having claim in only one of reference or prediction: 206
	Sentences having premise in both reference and prediction: 505
	Sentences having premise in only one of reference or prediction: 182
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(67.9176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.7415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(367.8177, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.4846, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.1573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.8347, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.8817, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.5470, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.1842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.4196, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.7823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.6352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.1001, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.3889, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(296.9413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.5265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.1031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.3685, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.5022, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.3565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.4269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.2313, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.8590, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.4722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.8158, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.9211, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.7848, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.3902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(266.6821, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.1041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.4474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.8280, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.5632, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.1827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.5448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.1604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.4666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(271.8329, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.8611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.8438, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.1323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.2103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.9834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.9055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.1084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.0022, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.8027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.3848, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.8356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(166.7450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.8171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.1534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.4331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.3732, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.9256, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.9440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.0450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.5861, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.36244541484716153
Sentence level Krippendorff's alpha for Premises:  0.4672489082969432
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 230
	Prediction sentences having premises: 381
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 75
	Prediction Sentence having neither claim nor premise: 151
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 468
	Sentences having claim in only one of reference or prediction: 219
	Sentences having premise in both reference and prediction: 504
	Sentences having premise in only one of reference or prediction: 183
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(55.5519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.8145, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(346.9197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.7500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.3900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.0225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.9584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.7028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.3884, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.5098, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.4722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.3958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.9284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.8854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.4446, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.4305, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.4235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.9939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.8017, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.1905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.1830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.2954, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.7452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.5075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.9077, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.7061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.3173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.1356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.3237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.3416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.0444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.6277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.7569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.3116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.5599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.2875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.3586, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(242.3247, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.5384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.0106, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.0604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.3010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.3713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.2722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.8728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.9584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.1519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.3889, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.3777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.0585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.5430, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.9013, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.2663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.8908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.9916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.8221, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.2143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.2207, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.34206695778748175
Sentence level Krippendorff's alpha for Premises:  0.4672489082969432
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 247
	Prediction sentences having premises: 377
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 81
	Prediction Sentence having neither claim nor premise: 144
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 461
	Sentences having claim in only one of reference or prediction: 226
	Sentences having premise in both reference and prediction: 504
	Sentences having premise in only one of reference or prediction: 183
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(46.5784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.4244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.7850, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.8949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.3027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.1437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.5601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(184.1901, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.6208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.6551, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.8882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.6107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(391.5132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.1784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.0512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.8038, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.7888, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.5720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.3089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.3022, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.5080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.3271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.9110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.5401, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.7268, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.1113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.6171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.0568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.9005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.6311, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.5963, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.5666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.5065, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.5434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.7478, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.8824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.5484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.7264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.7678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.9546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.7867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.4693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.7406, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.4341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.2596, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.8248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.6228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.0252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.2160, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.0079, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.2326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.9191, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.8987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.3965, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.2062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.7153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.9310, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.5885, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3770014556040757
Sentence level Krippendorff's alpha for Premises:  0.4730713245997089
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 233
	Prediction sentences having premises: 377
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 70
	Prediction Sentence having neither claim nor premise: 147
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 473
	Sentences having claim in only one of reference or prediction: 214
	Sentences having premise in both reference and prediction: 506
	Sentences having premise in only one of reference or prediction: 181
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(36.7289, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.8115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.0286, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.8764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.3904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.3148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.3643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.3173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.6629, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.2338, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.4678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.0975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(359.9527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.2172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.2718, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.6090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.0368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.4818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.5217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.3233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.8910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.3281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.4168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.8723, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.0361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.5222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.8705, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.7996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.4290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.8236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.3781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.6847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.6078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.4212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.1190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.6303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.4115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(206.1208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.7881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.6638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.1751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.3884, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.6132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.9264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.1387, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.4376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.4879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.6004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.4362, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.5986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.9334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.8938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.3877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.7693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.8240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.2313, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.7402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.9203, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3944687045123726
Sentence level Krippendorff's alpha for Premises:  0.4788937409024745
Additional attributes: 
	Total Sentences: 687
	Prediction sentences having claims: 241
	Prediction sentences having premises: 377
	Reference sentences having claims: 269
	Reference sentences having premises: 332


	Prediction Sentence having both claim and premise: 74
	Prediction Sentence having neither claim nor premise: 143
	Reference Sentence having both claim and premise: 69
	Reference Sentence having neither claim nor premise: 155


	Sentences having claim in both reference and prediction: 479
	Sentences having claim in only one of reference or prediction: 208
	Sentences having premise in both reference and prediction: 508
	Sentences having premise in only one of reference or prediction: 179
				 Metric computations: None
	Train size: 50 Test size: 50


		-------------RUN 1-----------
			------------EPOCH 1---------------
Loss:  tensor(2163.8689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2070.8176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2488.9722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1754.5116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1846.6431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2062.8276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1652.6421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1598.0273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1004.4931, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1508.3103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1558.5779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1993.2822, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2026.2122, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1207.6935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1926.3054, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2841.0454, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1228.7494, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1844.0203, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1866.2517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(960.8189, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(783.5111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1859.7380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2282.0164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2347.5994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1987.8521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(988.2532, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1633.2562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1251.4666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1761.0029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1722.0551, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1716.3716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2035.4751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2224.8181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1296.4365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1661.2008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2506.8257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1292.4098, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3063763608087092
Sentence level Krippendorff's alpha for Premises:  0.25142560912389844
Additional attributes: 
	Total Sentences: 1929
	Prediction sentences having claims: 7
	Prediction sentences having premises: 581
	Reference sentences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 0
	Prediction Sentence having neither claim nor premise: 1341
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1260
	Sentences having claim in only one of reference or prediction: 669
	Sentences having premise in both reference and prediction: 1207
	Sentences having premise in only one of reference or prediction: 722
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(1369.7842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1419.5365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1707.9360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1303.4282, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1408.5902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1525.7380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1241.6342, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1191.7166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(801.5070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1207.6553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1300.2179, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1658.7738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1775.1691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(981.1368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1531.7943, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2321.8267, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(987.5553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1543.9417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1557.0212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(842.8808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(655.0339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1555.2465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1894.0500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1838.4653, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1805.5055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(882.0701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1386.6938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1027.4460, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1684.9658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1634.6562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1587.6825, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1971.8733, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1918.7611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1139.5623, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1380.9274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2187.3687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1106.3542, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3198548470710213
Sentence level Krippendorff's alpha for Premises:  0.3727319854847071
Additional attributes: 
	Total Sentences: 1929
	Prediction sentences having claims: 32
	Prediction sentences having premises: 1004
	Reference sentences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 1
	Prediction Sentence having neither claim nor premise: 894
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1273
	Sentences having claim in only one of reference or prediction: 656
	Sentences having premise in both reference and prediction: 1324
	Sentences having premise in only one of reference or prediction: 605
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(1175.6770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1058.9086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1298.9061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1063.7765, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1097.0427, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1317.1382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1088.0485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1064.1880, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(688.1292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1074.0303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1138.9263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1463.7944, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1599.7656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(868.2036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1311.2936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1988.3226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(830.6689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1373.1868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1385.2781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(775.5588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(580.0047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1296.5521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1605.8060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1533.8225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1601.6443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(758.0713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1177.7388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(861.9492, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1529.2166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1493.3115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1362.5541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1792.8994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1679.7881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(913.4929, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1135.7367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1830.8771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(962.3492, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.40798341109383096
Sentence level Krippendorff's alpha for Premises:  0.4173146708138932
Additional attributes: 
	Total Sentences: 1929
	Prediction sentences having claims: 339
	Prediction sentences having premises: 1077
	Reference sentences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 86
	Prediction Sentence having neither claim nor premise: 599
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1358
	Sentences having claim in only one of reference or prediction: 571
	Sentences having premise in both reference and prediction: 1367
	Sentences having premise in only one of reference or prediction: 562
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(937.6398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(855.6899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1073.6560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(896.9955, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(851.7017, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1169.3545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1007.3678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(907.4636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(562.5538, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(913.4587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(965.7130, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1234.0974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1342.5198, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(702.4954, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1091.9716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1707.2244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.0291, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1166.1753, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1124.0732, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(646.0253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(492.1869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1119.0247, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1360.6396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1229.0693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1353.8953, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(595.6426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(998.2251, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(724.1843, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1255.5454, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1194.5066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1129.5461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1404.1096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1444.0400, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(654.0939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(950.7808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1500.3086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(867.9457, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4473820632452048
Sentence level Krippendorff's alpha for Premises:  0.3810264385692068
Additional attributes: 
	Total Sentences: 1929
	Prediction sentences having claims: 283
	Prediction sentences having premises: 1286
	Reference sentences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 91
	Prediction Sentence having neither claim nor premise: 451
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1396
	Sentences having claim in only one of reference or prediction: 533
	Sentences having premise in both reference and prediction: 1332
	Sentences having premise in only one of reference or prediction: 597
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(658.1216, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(694.3580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(953.0278, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(768.1338, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(643.8373, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(973.0492, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(846.6588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(715.3993, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(479.7532, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(739.7784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(836.9908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1089.1250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1154.2698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(554.4717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(878.9301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1519.8810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(549.0943, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(978.4363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(869.7263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(531.8077, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(415.1834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(941.8965, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1054.1140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1004.9780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1064.8209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(424.2333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(826.4590, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(595.5935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1001.0941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1002.5652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(868.8961, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1135.8669, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1163.3657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.2676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(808.3654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1180.0615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(776.2123, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.48367029548989116
Sentence level Krippendorff's alpha for Premises:  0.4245723172628305
Additional attributes: 
	Total Sentences: 1929
	Prediction sentences having claims: 340
	Prediction sentences having premises: 1114
	Reference sentences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 105
	Prediction Sentence having neither claim nor premise: 580
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1431
	Sentences having claim in only one of reference or prediction: 498
	Sentences having premise in both reference and prediction: 1374
	Sentences having premise in only one of reference or prediction: 555
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(448.4199, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(614.8030, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(814.1970, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(539.8517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(445.1726, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(717.7009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(641.6105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(549.5488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(414.8603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(608.1238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(734.0375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(955.4059, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1048.2678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(428.7230, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(673.1699, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1285.9471, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.8851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(821.3323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(675.7398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(403.0082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(351.5635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(804.4506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(945.7665, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(873.3210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(845.5522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(312.9047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(686.8434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(486.0410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(791.2729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(770.5292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(671.7312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(929.7148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(945.9698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.6420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(607.3971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(942.9895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(726.9509, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.41005702436495595
Sentence level Krippendorff's alpha for Premises:  0.40902021772939345
Additional attributes: 
	Total Sentences: 1929
	Prediction sentences having claims: 129
	Prediction sentences having premises: 1227
	Reference sentences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 26
	Prediction Sentence having neither claim nor premise: 599
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1360
	Sentences having claim in only one of reference or prediction: 569
	Sentences having premise in both reference and prediction: 1359
	Sentences having premise in only one of reference or prediction: 570
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(321.5081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(518.5775, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(753.2347, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(509.1497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(416.6687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(695.1549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(487.1113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(488.2910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(353.0505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(472.3423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(686.3561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(849.1086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(912.8195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(365.4318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(476.0561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1372.2051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(272.0486, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(632.6297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(512.6707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.7139, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.0265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(654.4148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(791.9541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(714.4447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(839.6465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(311.2251, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(641.1173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.4352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1016.8374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(859.7754, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(820.5787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1176.0498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(898.6808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(505.6519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(689.3728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1169.5878, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(528.2631, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.44219803006739244
Sentence level Krippendorff's alpha for Premises:  0.37584240539139446
Additional attributes: 
	Total Sentences: 1929
	Prediction sentences having claims: 254
	Prediction sentences having premises: 1271
	Reference sentences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 84
	Prediction Sentence having neither claim nor premise: 488
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1391
	Sentences having claim in only one of reference or prediction: 538
	Sentences having premise in both reference and prediction: 1327
	Sentences having premise in only one of reference or prediction: 602
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(242.1984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(266.3134, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(388.5573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(312.3250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.2747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(507.1516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(402.5580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(341.4319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(376.7932, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(503.3302, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(818.6819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(966.9723, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1477.7620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(496.3650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(742.3729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1504.8816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(395.2821, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(917.7778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1104.8213, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(417.9875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(323.1522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(799.1380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1130.0131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(939.4231, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(630.2155, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(244.2146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(578.0416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(386.3815, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(605.2040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(543.9390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(510.2909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(742.8896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(835.8177, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(353.8619, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(479.2590, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(874.3605, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(624.3704, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.18714359771902545
Sentence level Krippendorff's alpha for Premises:  0.28045619491964746
Additional attributes: 
	Total Sentences: 1929
	Prediction sentences having claims: 1244
	Prediction sentences having premises: 543
	Reference sentences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 157
	Prediction Sentence having neither claim nor premise: 299
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1145
	Sentences having claim in only one of reference or prediction: 784
	Sentences having premise in both reference and prediction: 1235
	Sentences having premise in only one of reference or prediction: 694
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(560.1019, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(490.8245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(732.4452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(566.4502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(494.4971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(992.7343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(784.2955, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(775.3219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(325.8754, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(553.0359, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(741.9624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(907.6481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(850.0308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(355.3415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(668.5779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1845.2245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(499.0788, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(793.2798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(758.6234, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(357.0820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(345.7354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(768.2933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1126.8354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1005.7044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(736.3008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.3453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(588.8172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(328.9684, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(557.0214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(633.6565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(537.5948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(694.9094, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(773.3016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(360.3731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(469.5615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(775.0266, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(670.3584, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.32503888024883354
Sentence level Krippendorff's alpha for Premises:  0.2452047693105236
Additional attributes: 
	Total Sentences: 1929
	Prediction sentences having claims: 767
	Prediction sentences having premises: 307
	Reference sentences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 96
	Prediction Sentence having neither claim nor premise: 951
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1278
	Sentences having claim in only one of reference or prediction: 651
	Sentences having premise in both reference and prediction: 1201
	Sentences having premise in only one of reference or prediction: 728
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(725.6924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(466.5136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(701.4800, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(686.5007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(576.8983, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(824.0533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(671.0372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(688.3363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(330.6597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(650.7498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(706.0979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(782.3529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(816.3298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.7849, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(514.3378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1012.9772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.4075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(478.6132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(403.9449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.8400, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(240.9377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(504.9605, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(607.6517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(462.9995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(567.2720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.8242, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(538.5131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.0982, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(738.6103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1018.7338, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(736.9167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1120.4647, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1141.4709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(361.9967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(638.3441, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1232.8458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(469.4335, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.42975635044064286
Sentence level Krippendorff's alpha for Premises:  0.4048729911871436
Additional attributes: 
	Total Sentences: 1929
	Prediction sentences having claims: 666
	Prediction sentences having premises: 1111
	Reference sentences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 200
	Prediction Sentence having neither claim nor premise: 352
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1379
	Sentences having claim in only one of reference or prediction: 550
	Sentences having premise in both reference and prediction: 1355
	Sentences having premise in only one of reference or prediction: 574
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(299.7421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(254.4505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(357.5494, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.2992, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.4423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(380.6313, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.1222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.6777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.3586, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.4675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(380.0026, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(471.4253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(642.1710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.4770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(436.0051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(938.8004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.4620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(608.4766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(475.0443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(279.7359, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.6878, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(616.6256, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(823.3398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(759.3556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(434.2109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.5771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(339.1245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.9563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(504.5435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(398.9937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.1327, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(493.4794, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(492.0429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.4949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(329.7432, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(592.0809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(348.5540, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.456713322965267
Sentence level Krippendorff's alpha for Premises:  0.4038361845515811
Additional attributes: 
	Total Sentences: 1929
	Prediction sentences having claims: 556
	Prediction sentences having premises: 1206
	Reference sentences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 200
	Prediction Sentence having neither claim nor premise: 367
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1405
	Sentences having claim in only one of reference or prediction: 524
	Sentences having premise in both reference and prediction: 1354
	Sentences having premise in only one of reference or prediction: 575
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(183.9118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.1173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(262.5057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.0169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.9524, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(346.4859, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(237.5682, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(174.7674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.4579, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.3680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(346.1999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(447.0881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(600.1396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.4414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.8926, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(786.9541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.0276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(355.6308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.9204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(164.6584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.4776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(342.2567, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(500.9362, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(329.5242, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.0348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.2002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.9211, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.6055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.7071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(323.1695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(348.9955, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.8936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(381.7185, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.1654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.0131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(475.2990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(291.8213, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4121306376360808
Sentence level Krippendorff's alpha for Premises:  0.4048729911871436
Additional attributes: 
	Total Sentences: 1929
	Prediction sentences having claims: 841
	Prediction sentences having premises: 923
	Reference sentences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 228
	Prediction Sentence having neither claim nor premise: 393
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1362
	Sentences having claim in only one of reference or prediction: 567
	Sentences having premise in both reference and prediction: 1355
	Sentences having premise in only one of reference or prediction: 574
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(137.7172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.5648, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.4807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.7173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.7981, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.1060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.7253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.9664, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.8042, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.1942, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.8831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.8725, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(459.8618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.7485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.4660, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(676.7783, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.6129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(327.9089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.5018, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.3051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.8803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(275.1191, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.3499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.0175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(210.4043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.2224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.2987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.4812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.9807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.0719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.5811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.9932, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(342.1472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.1059, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.0544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(390.7369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.9489, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4380508035251426
Sentence level Krippendorff's alpha for Premises:  0.3965785381026439
Additional attributes: 
	Total Sentences: 1929
	Prediction sentences having claims: 618
	Prediction sentences having premises: 1169
	Reference sentences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 221
	Prediction Sentence having neither claim nor premise: 363
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1387
	Sentences having claim in only one of reference or prediction: 542
	Sentences having premise in both reference and prediction: 1347
	Sentences having premise in only one of reference or prediction: 582
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(97.5275, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.7208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(146.2652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.6825, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.4454, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.4837, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.5821, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.6878, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.8372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.7411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.2990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.7506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.8942, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.7953, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.7772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(563.4276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.7575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.5171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.7411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.3171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.4166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.0920, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(355.0227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.4450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.6281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.1781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.3186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.0693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.3656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.9824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.8552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.8733, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(287.3950, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.0780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.1767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(355.0695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.6141, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.42146189735614303
Sentence level Krippendorff's alpha for Premises:  0.40902021772939345
Additional attributes: 
	Total Sentences: 1929
	Prediction sentences having claims: 774
	Prediction sentences having premises: 1023
	Reference sentences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 242
	Prediction Sentence having neither claim nor premise: 374
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1371
	Sentences having claim in only one of reference or prediction: 558
	Sentences having premise in both reference and prediction: 1359
	Sentences having premise in only one of reference or prediction: 570
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(84.1859, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.4634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.4988, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.5608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.0196, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.0160, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.2111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.4923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.1187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.8682, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.2228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.9232, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(296.3552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.5182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.7530, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(507.2084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.2909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.0426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.7889, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.9416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.5268, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(174.7433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(272.6711, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.6391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.4287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.2307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.9747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.0609, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.7922, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.2368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.1391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.5684, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.8956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.2726, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.5603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.5659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.3907, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.423535510627268
Sentence level Krippendorff's alpha for Premises:  0.4038361845515811
Additional attributes: 
	Total Sentences: 1929
	Prediction sentences having claims: 722
	Prediction sentences having premises: 1064
	Reference sentences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 222
	Prediction Sentence having neither claim nor premise: 365
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1373
	Sentences having claim in only one of reference or prediction: 556
	Sentences having premise in both reference and prediction: 1354
	Sentences having premise in only one of reference or prediction: 575
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(53.3910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.9189, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.7969, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.0753, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.3491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.6381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.6876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.2589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.9672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.5200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.3149, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.5651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.8325, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.1181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.0840, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(414.7134, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.7596, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.6776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.0471, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.4764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.8276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.7812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.0923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.1257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.9934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.1559, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.7313, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.9973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.7485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.8427, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.3483, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.0233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(202.2867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.6301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.2052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(257.9252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.5740, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.42146189735614303
Sentence level Krippendorff's alpha for Premises:  0.4142042509072058
Additional attributes: 
	Total Sentences: 1929
	Prediction setences having claims: 730
	Prediction sentences having premises: 1070
	Reference setences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 240
	Prediction Sentence having neither claim nor premise: 369
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1371
	Sentences having claim in only one of reference or prediction: 558
	Sentences having premise in both reference and prediction: 1364
	Sentences having premise in only one of reference or prediction: 565
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(44.2671, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.7029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.7467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.4685, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.6863, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.5686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.3431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.3111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.1582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.9981, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.0091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.7237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.5499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.6494, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.1898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(381.0563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.1308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.9269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.7194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.6395, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.1381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.6019, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.2934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.8152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.4734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.2967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.9535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.1886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.8579, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.6198, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.6276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.2040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.5906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.9336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.8324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(239.2220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.5457, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.42042509072058065
Sentence level Krippendorff's alpha for Premises:  0.4069466044582686
Additional attributes: 
	Total Sentences: 1929
	Prediction setences having claims: 767
	Prediction sentences having premises: 1047
	Reference setences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 230
	Prediction Sentence having neither claim nor premise: 345
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1370
	Sentences having claim in only one of reference or prediction: 559
	Sentences having premise in both reference and prediction: 1357
	Sentences having premise in only one of reference or prediction: 572
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(32.9150, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.0934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.0157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.7709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.8865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.2673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.3772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.0530, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.7371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.4423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.3378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.3435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.5900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.4841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.4606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(369.0263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.7481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.3939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.7000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.8554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.8184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.7705, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.6791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.9367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.4030, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.3178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.5986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.1874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.9866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.3393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.7335, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.0603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.8030, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.7228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.4294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(237.9345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.2040, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.42042509072058065
Sentence level Krippendorff's alpha for Premises:  0.4162778641783308
Additional attributes: 
	Total Sentences: 1929
	Prediction setences having claims: 685
	Prediction sentences having premises: 1088
	Reference setences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 227
	Prediction Sentence having neither claim nor premise: 383
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1370
	Sentences having claim in only one of reference or prediction: 559
	Sentences having premise in both reference and prediction: 1366
	Sentences having premise in only one of reference or prediction: 563
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(35.2466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.1656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.8544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.3389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.8511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.3507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.7812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.0904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.4622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.4016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.0173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.2589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.2960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.0791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.2126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(397.9388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.1263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.3560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.3935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.2774, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.2081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.7923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.9904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.6641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.6170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.9426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.5746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.4325, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.8622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.4354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.2250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.9195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(240.0136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.0941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.4636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.6844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.7783, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.39035769828926903
Sentence level Krippendorff's alpha for Premises:  0.3965785381026439
Additional attributes: 
	Total Sentences: 1929
	Prediction setences having claims: 812
	Prediction sentences having premises: 1025
	Reference setences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 250
	Prediction Sentence having neither claim nor premise: 342
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1341
	Sentences having claim in only one of reference or prediction: 588
	Sentences having premise in both reference and prediction: 1347
	Sentences having premise in only one of reference or prediction: 582
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(24.6486, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.8871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.2758, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.5192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.2007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.7150, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.0621, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.2977, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.4628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.3181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.7489, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.6027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.9453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.3859, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.9830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(414.5720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.1411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.0885, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.2056, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.7271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.0561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.8933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.5354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.9218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.5416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.5606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.5092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.8110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.6782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.7752, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.3574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.0308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.7793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.9279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.2491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.3285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.0097, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.42975635044064286
Sentence level Krippendorff's alpha for Premises:  0.4027993779160186
Additional attributes: 
	Total Sentences: 1929
	Prediction setences having claims: 682
	Prediction sentences having premises: 1117
	Reference setences having claims: 676
	Reference sentences having premises: 833


	Prediction Sentence having both claim and premise: 235
	Prediction Sentence having neither claim nor premise: 365
	Reference Sentence having both claim and premise: 152
	Reference Sentence having neither claim nor premise: 572


	Sentences having claim in both reference and prediction: 1379
	Sentences having claim in only one of reference or prediction: 550
	Sentences having premise in both reference and prediction: 1353
	Sentences having premise in only one of reference or prediction: 576
				 Metric computations: None


		-------------RUN 2-----------
			------------EPOCH 1---------------
Loss:  tensor(3079.9949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3525.9839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2662.0227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1582.1560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1598.2905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2613.8193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2233.7583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2399.7170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2165.3625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2987.0061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2869.0864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2688.8818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2266.5742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1054.3818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1838.3312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(856.2650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(789.0936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2051.3772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2708.4290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2788.8562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1830.3525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1618.8716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1518.3026, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1653.0161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2240.6909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(938.3722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2655.8025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2105.3967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2487.2803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2319.2900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2105.3760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2847.3560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3560.0757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2567.3247, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2614.1418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1157.1707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2023.2166, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.2550761421319797
Sentence level Krippendorff's alpha for Premises:  0.20558375634517767
Additional attributes: 
	Total Sentences: 1576
	Prediction setences having claims: 449
	Prediction sentences having premises: 982
	Reference setences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 348
	Prediction Sentence having neither claim nor premise: 493
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences having claim in both reference and prediction: 989
	Sentences having claim in only one of reference or prediction: 587
	Sentences having premise in both reference and prediction: 950
	Sentences having premise in only one of reference or prediction: 626
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(2378.0010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2135.7354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1743.3895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1167.3563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1289.3240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2171.5706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1879.6852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1881.2368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1853.2717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2446.4321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2478.7158, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2169.9741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1865.6266, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(829.4656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1504.3035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(722.0623, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(658.3101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1660.8054, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2112.7925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2155.1487, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1517.0715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1372.2010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1297.4454, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1380.6871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2041.0083, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(804.1313, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2518.4492, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2086.9768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2335.7261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2089.9707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2068.3569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2775.5713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3224.5815, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2314.3367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2377.7988, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1046.5591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1820.5861, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.315989847715736
Sentence level Krippendorff's alpha for Premises:  0.32360406091370564
Additional attributes: 
	Total Sentences: 1576
	Prediction setences having claims: 601
	Prediction sentences having premises: 1005
	Reference setences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 408
	Prediction Sentence having neither claim nor premise: 378
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences having claim in both reference and prediction: 1037
	Sentences having claim in only one of reference or prediction: 539
	Sentences having premise in both reference and prediction: 1043
	Sentences having premise in only one of reference or prediction: 533
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(2111.2917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1826.3047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1547.6458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1077.0466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1226.5481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2134.5820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1838.5509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1647.7522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1822.5972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2322.6152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2449.3867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2069.0947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1755.5594, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(761.7063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1459.6228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(662.6566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(653.6910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1578.2856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2130.1416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2227.5122, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1516.0220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1267.4709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1237.6714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1251.0364, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1860.1216, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(715.5079, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2079.7612, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1551.2671, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2115.6577, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1937.3776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1754.8467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2473.9905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2944.1467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2172.8013, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2251.4692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1026.9596, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1794.1360, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3692893401015228
Sentence level Krippendorff's alpha for Premises:  0.32360406091370564
Additional attributes: 
	Total Sentences: 1576
	Prediction setences having claims: 713
	Prediction sentences having premises: 829
	Reference setences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 401
	Prediction Sentence having neither claim nor premise: 435
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences having claim in both reference and prediction: 1079
	Sentences having claim in only one of reference or prediction: 497
	Sentences having premise in both reference and prediction: 1043
	Sentences having premise in only one of reference or prediction: 533
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(2030.8875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1699.6672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1396.2922, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(966.6238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1039.8479, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1872.7439, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1498.3251, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1404.4230, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1496.6842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1990.5011, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1985.6532, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1856.1667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1631.4590, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(665.1132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1195.0374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(651.4941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(561.6115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1426.1011, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1839.3274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1795.4701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1288.5491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1115.1973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1149.0671, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1161.7128, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1731.9668, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(608.1578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1704.8511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1246.3055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1899.4519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1456.3818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1445.0282, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2083.0337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2622.7202, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1859.6117, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1872.9650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(868.2154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1586.2708, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.36040609137055835
Sentence level Krippendorff's alpha for Premises:  0.33121827411167515
Additional attributes: 
	Total Sentences: 1576
	Prediction setences having claims: 860
	Prediction sentences having premises: 713
	Reference setences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 319
	Prediction Sentence having neither claim nor premise: 322
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences having claim in both reference and prediction: 1072
	Sentences having claim in only one of reference or prediction: 504
	Sentences having premise in both reference and prediction: 1049
	Sentences having premise in only one of reference or prediction: 527
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(1774.2004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1444.2991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1197.3420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(812.6483, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(890.6386, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1772.6770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1350.3477, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1227.8234, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1341.0610, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1615.4224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1709.9619, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1507.7957, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1432.9451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(529.6401, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1002.2989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.3122, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(476.8423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1376.1924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1691.7397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1492.1675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1109.8000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(961.0778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1036.4532, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1043.7153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1606.3395, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(491.2248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1340.0012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(973.1620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1675.5509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1172.4983, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1199.9059, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1767.9091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2242.5640, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1511.5454, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1609.6978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(802.4381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1373.7920, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.2766497461928934
Sentence level Krippendorff's alpha for Premises:  0.33375634517766495
Additional attributes: 
	Total Sentences: 1576
	Prediction sentences having claims: 998
	Prediction sentences having premises: 439
	Reference sentences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 214
	Prediction Sentence having neither claim nor premise: 353
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences where reference and prediction agree on claim: 1006
	Sentences having claim in only one of reference or prediction: 570
	Sentences where reference and prediction agree on premise: 1051
	Sentences having premise in only one of reference or prediction: 525
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(1627.4519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1201.4622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1025.0609, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(672.8178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(735.1987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1610.2031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1148.0437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1061.9021, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1101.4460, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1269.7881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1350.2950, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1204.5874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1170.1317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(429.0536, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(806.5242, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(444.5683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(414.3488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1234.7202, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1485.4062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1229.5990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1023.5728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(900.5681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1019.6666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1054.0641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1528.3550, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(404.4109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1235.8223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(899.0481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1510.4893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(958.3793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1068.3157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1630.3441, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1870.9279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1287.4260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1370.2771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(645.2823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1120.8508, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.133248730964467
Sentence level Krippendorff's alpha for Premises:  0.2817258883248731
Additional attributes: 
	Total Sentences: 1576
	Prediction sentences having claims: 1141
	Prediction sentences having premises: 228
	Reference sentences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 135
	Prediction Sentence having neither claim nor premise: 342
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences where reference and prediction agree on claim: 893
	Sentences having claim in only one of reference or prediction: 683
	Sentences where reference and prediction agree on premise: 1010
	Sentences having premise in only one of reference or prediction: 566
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(1459.9700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1086.7585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(939.3613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(601.0535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(663.6692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1689.2726, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1089.5057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1001.7010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1245.3228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1475.0989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1510.6451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1298.3706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1149.2527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(401.0554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(834.7907, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(514.3564, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.2181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(943.2534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1200.0332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1151.9113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(843.0284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(677.1625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(784.6237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(725.9382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1363.4875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(349.0249, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1030.4751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(752.3502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1647.4946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(953.6718, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1044.3053, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1887.0261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2396.2639, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1642.3418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1683.9294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(898.1310, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1245.3066, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.44543147208121825
Sentence level Krippendorff's alpha for Premises:  0.3032994923857868
Additional attributes: 
	Total Sentences: 1576
	Prediction sentences having claims: 445
	Prediction sentences having premises: 1097
	Reference sentences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 252
	Prediction Sentence having neither claim nor premise: 286
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences where reference and prediction agree on claim: 1139
	Sentences having claim in only one of reference or prediction: 437
	Sentences where reference and prediction agree on premise: 1027
	Sentences having premise in only one of reference or prediction: 549
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(1499.7759, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1299.8081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1003.3141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(617.0417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(628.5444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1136.7791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(849.0320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(824.3367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(872.2635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(995.6599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(988.0942, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(883.1676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(998.5098, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(334.3469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(756.6561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(410.4595, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.4426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1116.5100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1365.9485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1207.1547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(953.8419, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(775.3860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(791.2211, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(761.3434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1389.3717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.7071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1749.7063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1433.3789, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1742.6740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1148.9449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1025.4241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1517.6967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1691.7556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1249.6206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1450.2168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(596.3219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(871.6282, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4365482233502538
Sentence level Krippendorff's alpha for Premises:  0.3743654822335025
Additional attributes: 
	Total Sentences: 1576
	Prediction sentences having claims: 548
	Prediction sentences having premises: 933
	Reference sentences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 267
	Prediction Sentence having neither claim nor premise: 362
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences where reference and prediction agree on claim: 1132
	Sentences having claim in only one of reference or prediction: 444
	Sentences where reference and prediction agree on premise: 1083
	Sentences having premise in only one of reference or prediction: 493
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(1059.8905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(822.1426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(620.7474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(445.8160, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(627.9918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(982.0024, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(726.4886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(715.0958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(796.7672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(970.2809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(919.6899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1025.0764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(971.5051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(356.2809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(737.4618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.4697, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(356.2356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(923.9235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1249.4695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1079.5498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(797.1105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(571.9803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(668.7635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(710.3004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1112.8871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.0399, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(666.9528, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(586.8167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(900.2672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(618.7669, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(536.3064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1023.7433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1261.6160, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(872.8743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(949.3533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(427.5666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(676.1105, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.44289340101522845
Sentence level Krippendorff's alpha for Premises:  0.37944162436548223
Additional attributes: 
	Total Sentences: 1576
	Prediction sentences having claims: 615
	Prediction sentences having premises: 769
	Reference sentences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 238
	Prediction Sentence having neither claim nor premise: 430
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences where reference and prediction agree on claim: 1137
	Sentences having claim in only one of reference or prediction: 439
	Sentences where reference and prediction agree on premise: 1087
	Sentences having premise in only one of reference or prediction: 489
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(864.3118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(683.0541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(499.4922, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(341.6948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(401.0504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(913.8082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(614.5667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(548.6604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(674.3512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(703.1038, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(659.2295, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(607.5852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(616.5549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.6131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(513.5971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.6293, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(239.8136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(687.7341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(859.6429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(769.8499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(625.7641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(424.8502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(501.3820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(465.3852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(887.9822, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.2285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(620.1865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(524.1472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(720.7250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(495.4378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(403.3374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(810.5741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(972.7838, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(725.3182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(780.3581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(332.7910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(519.6315, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4682741116751269
Sentence level Krippendorff's alpha for Premises:  0.3565989847715736
Additional attributes: 
	Total Sentences: 1576
	Prediction sentences having claims: 447
	Prediction sentences having premises: 993
	Reference sentences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 199
	Prediction Sentence having neither claim nor premise: 335
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences where reference and prediction agree on claim: 1157
	Sentences having claim in only one of reference or prediction: 419
	Sentences where reference and prediction agree on premise: 1069
	Sentences having premise in only one of reference or prediction: 507
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(746.3043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(497.4529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(399.4520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(307.9514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(371.9507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(623.5584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(459.5851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(485.9959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(529.7868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(604.1395, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(534.4901, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.6204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.6909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.4851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(458.7261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.1113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.9687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(593.5458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(686.5597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(577.6058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(479.1743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(304.8321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(373.6867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(399.9522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(737.6720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.3733, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(381.1062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(338.1259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(521.7939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(373.6183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(317.8192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(656.1223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(782.6016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(625.8748, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(701.0717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(311.9661, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(441.4628, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.36548223350253806
Sentence level Krippendorff's alpha for Premises:  0.38705583756345174
Additional attributes: 
	Total Sentences: 1576
	Prediction sentences having claims: 812
	Prediction sentences having premises: 767
	Reference sentences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 224
	Prediction Sentence having neither claim nor premise: 221
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences where reference and prediction agree on claim: 1076
	Sentences having claim in only one of reference or prediction: 500
	Sentences where reference and prediction agree on premise: 1093
	Sentences having premise in only one of reference or prediction: 483
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(625.1244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(474.0817, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(370.2570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.3509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.2701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(533.4918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(482.6446, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(444.2012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(427.3033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(416.7065, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(392.3736, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.7010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(376.4295, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.3407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(340.4432, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.8662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.1308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(448.6392, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(561.8980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(456.7564, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(415.3473, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.3146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(362.5052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(387.6360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(775.3695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.6099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(408.0346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(333.8187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(529.3331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(307.5736, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.4265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(674.8117, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(651.0668, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(550.5044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(570.7462, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.3940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(296.9856, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.41624365482233505
Sentence level Krippendorff's alpha for Premises:  0.38451776649746194
Additional attributes: 
	Total Sentences: 1576
	Prediction sentences having claims: 644
	Prediction sentences having premises: 929
	Reference sentences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 223
	Prediction Sentence having neither claim nor premise: 226
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences where reference and prediction agree on claim: 1116
	Sentences having claim in only one of reference or prediction: 460
	Sentences where reference and prediction agree on premise: 1091
	Sentences having premise in only one of reference or prediction: 485
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(464.3607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(385.0096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.9092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.1873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.0476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.4222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.1787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(396.8239, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.7491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.8216, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(390.7452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(299.6481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(369.1649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.4901, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(356.8063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.3425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.6927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(448.6963, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(621.0788, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(507.6906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(396.2743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.4354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(244.8781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.7035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(531.6151, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.4734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.4733, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(242.1685, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.1293, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.0522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.2066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(704.8282, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(679.0924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(589.2354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(598.2834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.9328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(324.3291, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.43527918781725883
Sentence level Krippendorff's alpha for Premises:  0.38705583756345174
Additional attributes: 
	Total Sentences: 1576
	Prediction sentences having claims: 493
	Prediction sentences having premises: 889
	Reference sentences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 183
	Prediction Sentence having neither claim nor premise: 377
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences where reference and prediction agree on claim: 1131
	Sentences having claim in only one of reference or prediction: 445
	Sentences where reference and prediction agree on premise: 1093
	Sentences having premise in only one of reference or prediction: 483
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(790.8032, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(359.2881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(262.2589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.7538, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.6098, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(320.7860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.5852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(510.5702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(371.8927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(286.6462, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.6870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.8088, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(349.1587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.3887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.6909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.5501, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.2469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(347.5419, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(679.3303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(508.1251, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(408.1802, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.6959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.4159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.2326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(635.6960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.8769, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.3185, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.1548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(370.5815, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.6151, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(244.7332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1014.5553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(731.5153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(664.3877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(519.8296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.6080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(249.1753, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.40862944162436543
Sentence level Krippendorff's alpha for Premises:  0.40862944162436543
Additional attributes: 
	Total Sentences: 1576
	Prediction sentences having claims: 768
	Prediction sentences having premises: 730
	Reference sentences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 223
	Prediction Sentence having neither claim nor premise: 301
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences where reference and prediction agree on claim: 1110
	Sentences having claim in only one of reference or prediction: 466
	Sentences where reference and prediction agree on premise: 1110
	Sentences having premise in only one of reference or prediction: 466
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(604.9365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(349.8474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.7199, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.8942, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.0770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(299.0606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.5925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(421.2867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(364.0564, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(328.7216, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.0620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.9705, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(320.2264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.6109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(262.4102, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.1179, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.2352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(274.5206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(563.8060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(559.9564, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.9414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.6725, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.3520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.0240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(423.4229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.4447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.0925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.1812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(244.8777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(166.3400, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.0641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(488.1591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(508.1002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(461.5222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(477.1288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.1325, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.2885, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4314720812182741
Sentence level Krippendorff's alpha for Premises:  0.38451776649746194
Additional attributes: 
	Total Sentences: 1576
	Prediction sentences having claims: 492
	Prediction sentences having premises: 971
	Reference sentences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 189
	Prediction Sentence having neither claim nor premise: 302
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences having claim in both reference and prediction: 1128
	Sentences having claim in only one of reference or prediction: 448
	Sentences having premise in both reference and prediction: 1091
	Sentences having premise in only one of reference or prediction: 485
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(343.3204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.2910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.9553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.3219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.0741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(206.6428, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.0684, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(250.9862, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.8756, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.0755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.8966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.1878, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.8258, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.8229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.8736, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.2676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.4285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.1245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(405.2420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(324.9447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.5740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.8825, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.7835, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.3498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(405.7441, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.7529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.2308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.0086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.6756, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.1868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.2852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(415.8665, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(403.5810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.1693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(385.0984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.0375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.1777, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.44035532994923854
Sentence level Krippendorff's alpha for Premises:  0.4276649746192893
Additional attributes: 
	Total Sentences: 1576
	Prediction sentences having claims: 625
	Prediction sentences having premises: 841
	Reference sentences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 213
	Prediction Sentence having neither claim nor premise: 323
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences having claim in both reference and prediction: 1135
	Sentences having claim in only one of reference or prediction: 441
	Sentences having premise in both reference and prediction: 1125
	Sentences having premise in only one of reference or prediction: 451
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(238.9908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(249.6099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.9221, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.3306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.9388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.6934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.9282, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.6753, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.0365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.5872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.7424, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.1552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.1164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.7375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.6429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.6785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.1124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.2366, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.8566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.5465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.5320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.1418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.2874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.6450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(307.2168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.0140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.4977, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.8742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.4039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.5722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(166.3803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.6339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.2933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(329.1083, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(332.0995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.5346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.3587, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.44543147208121825
Sentence level Krippendorff's alpha for Premises:  0.4010152284263959
Additional attributes: 
	Total Sentences: 1576
	Prediction sentences having claims: 629
	Prediction sentences having premises: 854
	Reference sentences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 211
	Prediction Sentence having neither claim nor premise: 304
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences having claim in both reference and prediction: 1139
	Sentences having claim in only one of reference or prediction: 437
	Sentences having premise in both reference and prediction: 1104
	Sentences having premise in only one of reference or prediction: 472
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(198.4053, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.1704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.7143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.9603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.5116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.4640, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.7986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.0318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.7255, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.7608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.5713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.7901, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.4821, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.4994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.6071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.7938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.1772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.8611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.8914, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.9810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.9630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.6771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.2810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.9521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.0101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.1566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.3757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.5618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.5162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.7061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.3635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(312.4601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(296.7102, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.4968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.9049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.5104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.9219, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.43527918781725883
Sentence level Krippendorff's alpha for Premises:  0.3984771573604061
Additional attributes: 
	Total Sentences: 1576
	Prediction sentences having claims: 539
	Prediction sentences having premises: 920
	Reference sentences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 187
	Prediction Sentence having neither claim nor premise: 304
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences having claim in both reference and prediction: 1131
	Sentences having claim in only one of reference or prediction: 445
	Sentences having premise in both reference and prediction: 1102
	Sentences having premise in only one of reference or prediction: 474
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(207.9277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.6551, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.3448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.5087, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.2612, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.7152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.8831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.4643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.3438, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.4797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.5286, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.2401, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.6109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.9370, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.8631, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.7419, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.4725, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.5736, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.7246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.5611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.1093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.8706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.7235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.9951, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.4507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.3810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.3145, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.4174, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.1488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.3404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.0764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.6484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(323.0491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.7123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(314.3928, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.7547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.3992, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4517766497461929
Sentence level Krippendorff's alpha for Premises:  0.4048223350253807
Additional attributes: 
	Total Sentences: 1576
	Prediction sentences having claims: 598
	Prediction sentences having premises: 893
	Reference sentences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 218
	Prediction Sentence having neither claim nor premise: 303
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences having claim in both reference and prediction: 1144
	Sentences having claim in only one of reference or prediction: 432
	Sentences having premise in both reference and prediction: 1107
	Sentences having premise in only one of reference or prediction: 469
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(231.2705, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(279.3955, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.1243, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.3856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.9249, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.1752, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.1527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.0335, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.4813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.8945, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.0678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.4518, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.8696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.2111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.4498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.9918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.7071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.7318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.7805, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(164.0058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.7266, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.7029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.7023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.8155, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.8752, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.8127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.0127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.7819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.2141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.5968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.3933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(411.2485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(335.7009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.5157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(298.4079, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.6075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.3788, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.43020304568527923
Sentence level Krippendorff's alpha for Premises:  0.4073604060913706
Additional attributes: 
	Total Sentences: 1576
	Prediction sentences having claims: 651
	Prediction sentences having premises: 831
	Reference sentences having claims: 590
	Reference sentences having premises: 654


	Prediction Sentence having both claim and premise: 212
	Prediction Sentence having neither claim nor premise: 306
	Reference Sentence having both claim and premise: 143
	Reference Sentence having neither claim nor premise: 475


	Sentences having claim in both reference and prediction: 1127
	Sentences having claim in only one of reference or prediction: 449
	Sentences having premise in both reference and prediction: 1109
	Sentences having premise in only one of reference or prediction: 467
				 Metric computations: None


		-------------RUN 3-----------
			------------EPOCH 1---------------
Loss:  tensor(2026.2385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1445.2545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2999.5366, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1613.5278, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2531.5667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2370.6096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2107.7773, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1885.0664, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1250.9768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1363.4478, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1416.9700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2716.3918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1215.1056, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2236.7651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2538.2134, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2683.1792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3064.1646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3170.9683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1855.9343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2387.0327, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1660.7972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1946.0261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1803.9788, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1160.4497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2262.3447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2109.1399, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1103.7314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1076.7621, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(828.7089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1498.0417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(556.0363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2243.2463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1593.8479, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1730.5044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1842.6873, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.24375995751460433
Sentence level Krippendorff's alpha for Premises:  0.168348380244291
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 11
	Prediction sentences having premises: 1300
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 5
	Prediction Sentence having neither claim nor premise: 577
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1171
	Sentences having claim in only one of reference or prediction: 712
	Sentences having premise in both reference and prediction: 1100
	Sentences having premise in only one of reference or prediction: 783
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(1404.3138, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1012.9607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2083.1431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1186.1277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2130.0403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2060.8762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1719.3589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1601.9056, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(996.6109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1111.0546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1200.2693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2229.5713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1069.0063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1778.1470, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2094.6948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2278.2920, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2765.4741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2827.1509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1624.3259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2204.1938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1434.3242, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1581.1958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1556.9775, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1014.2797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2096.9045, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1905.4761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(936.1014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(943.6535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(688.1916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1306.0264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(492.7590, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1902.6912, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1375.0032, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1516.5486, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1675.2152, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3127987254381307
Sentence level Krippendorff's alpha for Premises:  0.2851832182687202
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 350
	Prediction sentences having premises: 1242
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 123
	Prediction Sentence having neither claim nor premise: 414
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1236
	Sentences having claim in only one of reference or prediction: 647
	Sentences having premise in both reference and prediction: 1210
	Sentences having premise in only one of reference or prediction: 673
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(1128.6931, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(849.4124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1774.0415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1086.2034, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1981.9634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1863.1056, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1529.1306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1426.4910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(856.2693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(953.4051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1086.0337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1939.9429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(946.2916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1483.3269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1819.5654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1961.1626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2458.7930, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2506.6787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1416.7205, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2080.2246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1231.7002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1285.2340, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1325.8545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(864.2610, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1896.2513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1733.5388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(795.3959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(831.5023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(587.2697, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1147.4043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(392.8663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1624.0945, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1179.9342, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1275.2200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1397.4307, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3605947955390335
Sentence level Krippendorff's alpha for Premises:  0.3181093998937865
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 601
	Prediction sentences having premises: 1099
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 231
	Prediction Sentence having neither claim nor premise: 414
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1281
	Sentences having claim in only one of reference or prediction: 602
	Sentences having premise in both reference and prediction: 1241
	Sentences having premise in only one of reference or prediction: 642
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(910.5130, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(714.6412, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1545.8948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1003.2222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1735.6052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1535.9363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1356.2695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1199.9897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(707.9852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(784.8828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(958.6848, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1684.3159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(835.4578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1234.6299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1590.9507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1610.6711, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2114.2104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2156.4785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1178.2903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1865.7053, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1086.3628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1108.3596, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1141.9758, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(728.8459, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1640.8021, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1478.9966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(640.0739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(706.2026, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.4180, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(994.1688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(312.5700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1413.3435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(999.3951, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1077.3573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1097.5006, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3584705257567711
Sentence level Krippendorff's alpha for Premises:  0.37015400955921407
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 629
	Prediction sentences having premises: 1020
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 208
	Prediction Sentence having neither claim nor premise: 442
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1279
	Sentences having claim in only one of reference or prediction: 604
	Sentences having premise in both reference and prediction: 1290
	Sentences having premise in only one of reference or prediction: 593
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(741.6241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(594.1008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1349.9719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(872.9715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1427.3434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1254.7062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1202.2466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(983.8378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(536.6886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(609.4433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(851.1042, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1480.3860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(736.0096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1030.5104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1300.3772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1228.6013, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1783.9250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1608.9236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(965.1873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1544.0627, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(883.3406, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(876.9488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(907.2390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(626.7919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1361.9805, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1211.0911, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(471.1108, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(563.7696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.1245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(792.2001, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.5665, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1214.3456, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(858.2222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(938.7652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(945.8049, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3659054699946893
Sentence level Krippendorff's alpha for Premises:  0.3489113117365905
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 556
	Prediction sentences having premises: 1122
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 190
	Prediction Sentence having neither claim nor premise: 395
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1286
	Sentences having claim in only one of reference or prediction: 597
	Sentences having premise in both reference and prediction: 1270
	Sentences having premise in only one of reference or prediction: 613
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(562.2692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(449.4114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1115.4556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(779.5012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1112.7777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1056.4805, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(969.2779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(792.2974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.7903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(439.3562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(713.7744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1254.9424, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(616.9453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(841.1641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1118.2380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(953.6870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1551.6604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1471.7776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(821.4785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1310.3838, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(681.3160, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(707.8366, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(689.9977, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(531.9360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1124.4698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(979.3218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(382.8639, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(432.9566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(337.7841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(644.4230, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(166.1303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1076.6980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(708.1222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(792.8197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(795.2038, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.42963356346255976
Sentence level Krippendorff's alpha for Premises:  0.3319171534784918
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 498
	Prediction sentences having premises: 1212
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 183
	Prediction Sentence having neither claim nor premise: 356
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1346
	Sentences having claim in only one of reference or prediction: 537
	Sentences having premise in both reference and prediction: 1254
	Sentences having premise in only one of reference or prediction: 629
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(394.2573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.8229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(818.6869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(612.0521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(858.5361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(778.2676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(748.4738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(625.3433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.3892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.6044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(586.2673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1051.2450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(491.9490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(646.8685, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(990.4503, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(778.2610, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1225.6689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1185.5106, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(576.9014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(914.1721, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(607.9863, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(578.0530, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(545.2183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.7870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1124.8601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1044.4109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(362.1355, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(470.0325, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.3961, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(574.6611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.8378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1143.4294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(610.9135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(660.3212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(539.7817, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3998937865108869
Sentence level Krippendorff's alpha for Premises:  0.3096123207647371
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 362
	Prediction sentences having premises: 1293
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 136
	Prediction Sentence having neither claim nor premise: 364
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1318
	Sentences having claim in only one of reference or prediction: 565
	Sentences having premise in both reference and prediction: 1233
	Sentences having premise in only one of reference or prediction: 650
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(285.4871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(239.5789, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(647.6104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(465.0610, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(861.6052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(784.5409, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(775.5806, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(652.2275, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.5147, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.1819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(530.8398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(965.6621, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.1590, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(742.4904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1087.1324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(791.0474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1423.4312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1442.8636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(557.7815, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1037.4816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(482.4653, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(513.3513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(425.3677, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(403.1823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(729.3470, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(653.2444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.0520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.4628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.9630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(523.8619, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.4162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(764.2856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(692.7618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(789.0419, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(940.8613, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3839617631439193
Sentence level Krippendorff's alpha for Premises:  0.42644715878916617
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 201
	Prediction sentences having premises: 1085
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 78
	Prediction Sentence having neither claim nor premise: 675
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1303
	Sentences having claim in only one of reference or prediction: 580
	Sentences having premise in both reference and prediction: 1343
	Sentences having premise in only one of reference or prediction: 540
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(676.2405, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(439.3224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1351.8324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(794.7484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1057.0759, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1000.0385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1275.0508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(960.0388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(257.1534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.7401, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(538.5001, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1031.7437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(464.1635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(675.4633, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(920.8782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(857.2039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1140.2393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1227.0898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(567.8910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(916.0187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(498.6144, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(440.9791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(449.7852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(332.1778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1243.5378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1125.9900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(436.1323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(454.3824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(405.0274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(888.6731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(316.9203, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1315.1417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(787.8598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1145.3472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1416.9128, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.2501327668613914
Sentence level Krippendorff's alpha for Premises:  0.37971322357939463
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 1243
	Prediction sentences having premises: 589
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 222
	Prediction Sentence having neither claim nor premise: 273
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1177
	Sentences having claim in only one of reference or prediction: 706
	Sentences having premise in both reference and prediction: 1299
	Sentences having premise in only one of reference or prediction: 584
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(407.2385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(454.1505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1105.9423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(770.4213, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(908.0447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(814.7389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(933.9362, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(595.2421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(207.5030, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.8113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(484.7065, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(838.5402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(425.9543, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(527.5358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(767.4147, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(770.3480, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1258.4480, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1113.4897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(651.4097, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1009.0133, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(606.0210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(731.4493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(687.5936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(704.6470, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(921.6736, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(839.2795, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(355.8969, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(374.4045, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(277.4201, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.5025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.1094, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(882.3740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(562.2890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(542.4464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(633.6595, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3807753584705258
Sentence level Krippendorff's alpha for Premises:  0.45193839617631437
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 840
	Prediction sentences having premises: 787
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 266
	Prediction Sentence having neither claim nor premise: 522
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1300
	Sentences having claim in only one of reference or prediction: 583
	Sentences having premise in both reference and prediction: 1367
	Sentences having premise in only one of reference or prediction: 516
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(252.5383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.1352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(619.7701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(470.9128, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1103.3019, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(734.0355, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(981.2855, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(660.8160, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.6552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(274.4144, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(664.5483, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1348.2388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(413.7694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(564.7593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(900.4444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(845.4220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1067.9539, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(961.3781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(455.4707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(711.5410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(339.2831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(330.2834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.2975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.1187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(557.9741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(505.0440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.6939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.3116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(210.7884, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(369.3888, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.4942, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(819.7583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(512.8546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(660.5712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(510.2133, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4455655868295274
Sentence level Krippendorff's alpha for Premises:  0.3244822092405736
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 435
	Prediction sentences having premises: 1315
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 193
	Prediction Sentence having neither claim nor premise: 326
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1361
	Sentences having claim in only one of reference or prediction: 522
	Sentences having premise in both reference and prediction: 1247
	Sentences having premise in only one of reference or prediction: 636
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(264.7195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.2337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(543.4418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(433.9705, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(541.1940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(465.8228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(656.7722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(511.2811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(158.2494, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.1245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(416.1296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(763.7402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(239.4890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(398.5061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(588.9588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(471.3932, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(793.5952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(864.7161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(274.7327, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(445.1352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(266.3023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.8434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.5899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.3182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(432.3956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(399.1670, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.0860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.2238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.3800, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(319.9832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.5464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(537.1472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(291.8182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(400.0688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(336.6508, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3775889537971322
Sentence level Krippendorff's alpha for Premises:  0.38502389803505044
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 917
	Prediction sentences having premises: 820
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 241
	Prediction Sentence having neither claim nor premise: 387
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1297
	Sentences having claim in only one of reference or prediction: 586
	Sentences having premise in both reference and prediction: 1304
	Sentences having premise in only one of reference or prediction: 579
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(144.1732, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.5240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(364.4515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.8025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(399.3300, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(307.4093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(450.6916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(389.4454, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.9803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.9402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(391.1266, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(682.6417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.2643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.6851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(445.7416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(396.6407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(646.8004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(703.9037, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.6879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(364.3367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.2513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(239.1324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.0544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.3389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(367.1375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(325.8418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.8085, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.9169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.6815, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(271.2126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.4134, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(467.5328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.7606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(334.9301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.8297, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4338821030270844
Sentence level Krippendorff's alpha for Premises:  0.3552841210833776
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 442
	Prediction sentences having premises: 1236
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 186
	Prediction Sentence having neither claim nor premise: 391
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1350
	Sentences having claim in only one of reference or prediction: 533
	Sentences having premise in both reference and prediction: 1276
	Sentences having premise in only one of reference or prediction: 607
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(106.2157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.3896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.4752, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.6799, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.4553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.6897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(346.6971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(311.3215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.1811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.2052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.6937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(527.3173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.9306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(347.1570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(419.3816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(307.6568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(533.0423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(634.3900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.1298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.3225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.4975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.5949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.1416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.1322, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(316.3381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.2219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.5486, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.8670, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.7986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.2982, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.4809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(371.4987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.6195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.5963, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.6623, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4317578332448221
Sentence level Krippendorff's alpha for Premises:  0.4540626659585767
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 838
	Prediction sentences having premises: 933
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 258
	Prediction Sentence having neither claim nor premise: 370
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1348
	Sentences having claim in only one of reference or prediction: 535
	Sentences having premise in both reference and prediction: 1369
	Sentences having premise in only one of reference or prediction: 514
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(85.7906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.3463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.9266, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.9839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.6113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.4789, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(314.3061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.0238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.4152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.2417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(254.7143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(453.3503, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.0922, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.9978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.9308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(268.2516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(439.0715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(534.3844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.5120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.1360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.2617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.5555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.2664, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.0384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.4488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.9555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.0467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.7378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.5440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.9328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.4074, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(323.0282, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.8628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.2748, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.4287, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4593733404142326
Sentence level Krippendorff's alpha for Premises:  0.3807753584705258
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 522
	Prediction sentences having premises: 1176
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 211
	Prediction Sentence having neither claim nor premise: 396
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1374
	Sentences having claim in only one of reference or prediction: 509
	Sentences having premise in both reference and prediction: 1300
	Sentences having premise in only one of reference or prediction: 583
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(71.1906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.0377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.2492, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.1991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.1647, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.8903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.3918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.2722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.0096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.2707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.2798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(391.2166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.5279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.8481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.4902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.7274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(378.1009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(458.5923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.2226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.0372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.3504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.7658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.0986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.8886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.6221, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.2772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.4634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.6528, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.7890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.5856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.9991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.8961, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.0248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.8445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.2589, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.46362187997875726
Sentence level Krippendorff's alpha for Premises:  0.44981412639405205
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 752
	Prediction sentences having premises: 989
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 238
	Prediction Sentence having neither claim nor premise: 380
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1378
	Sentences having claim in only one of reference or prediction: 505
	Sentences having premise in both reference and prediction: 1365
	Sentences having premise in only one of reference or prediction: 518
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(55.5576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.3745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.7329, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.3451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.0853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.7560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(237.5419, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.0601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.6710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.6485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.1765, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.7445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.7596, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.5573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.9108, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.6186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(327.6351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(410.2352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.7343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.6639, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.9263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.3386, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.6630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.4942, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(184.1706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.1572, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.1856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.5762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.8365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.5918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.3391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.5975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.2683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.4957, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.7617, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4742432288900691
Sentence level Krippendorff's alpha for Premises:  0.41901221455124804
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 634
	Prediction sentences having premises: 1072
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 216
	Prediction Sentence having neither claim nor premise: 393
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1388
	Sentences having claim in only one of reference or prediction: 495
	Sentences having premise in both reference and prediction: 1336
	Sentences having premise in only one of reference or prediction: 547
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(44.0028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.1021, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.6513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.7934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.7227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.6131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.4714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.8032, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.5938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.8067, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.0497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(328.9034, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.1815, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.1696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.5414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.7873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.7565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(378.7929, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.4068, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.8746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.3981, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.5252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.1734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.1076, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.1322, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.5032, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.9375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.5545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.2337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.7583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.5365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(166.4851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.7493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.5290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.9789, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.467870419543282
Sentence level Krippendorff's alpha for Premises:  0.4423791821561338
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 728
	Prediction sentences having premises: 1006
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 240
	Prediction Sentence having neither claim nor premise: 389
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1382
	Sentences having claim in only one of reference or prediction: 501
	Sentences having premise in both reference and prediction: 1358
	Sentences having premise in only one of reference or prediction: 525
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(36.0693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.6435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.6578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.4858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.1025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.4734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.0746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.2862, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.4994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.2322, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.3709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.1213, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.5078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.9880, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.1670, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.3108, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.3073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(357.3919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.3735, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.1228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.6288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.8168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.1506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.0584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.1683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.1060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.3870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.3611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.9120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.2157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.8868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.0103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.0737, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.0154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.8821, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.46468401486988853
Sentence level Krippendorff's alpha for Premises:  0.4137015400955921
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 625
	Prediction sentences having premises: 1095
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 223
	Prediction Sentence having neither claim nor premise: 386
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1379
	Sentences having claim in only one of reference or prediction: 504
	Sentences having premise in both reference and prediction: 1331
	Sentences having premise in only one of reference or prediction: 552
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(30.8514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.0724, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.7234, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.9203, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.5718, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.2297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.4557, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.8143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.2036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.8508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.5178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.5114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.3728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.6368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.1292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.9587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.6309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(337.1808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.6530, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.6813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.3841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.9634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.4030, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.4568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.3883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.0231, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.7904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.2360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.2536, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.7720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(6.7627, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.6691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.3301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.6307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.1982, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4508762612851832
Sentence level Krippendorff's alpha for Premises:  0.45193839617631437
Additional attributes: 
	Total Sentences: 1883
	Prediction sentences having claims: 808
	Prediction sentences having premises: 943
	Reference sentences having claims: 713
	Reference sentences having premises: 757


	Prediction Sentence having both claim and premise: 256
	Prediction Sentence having neither claim nor premise: 388
	Reference Sentence having both claim and premise: 164
	Reference Sentence having neither claim nor premise: 577


	Sentences having claim in both reference and prediction: 1366
	Sentences having claim in only one of reference or prediction: 517
	Sentences having premise in both reference and prediction: 1367
	Sentences having premise in only one of reference or prediction: 516
				 Metric computations: None


		-------------RUN 4-----------
			------------EPOCH 1---------------
Loss:  tensor(2388.0889, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2451.0239, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3299.6653, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3874.4065, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2032.9160, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2470.6968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(996.4607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1175.6127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1809.5559, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1812.1464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1361.2339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2571.3296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1642.0162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1486.1240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1779.1492, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1494.0660, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2830.6240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1554.5260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1829.4451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2663.3066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2272.2139, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1944.4224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2669.6753, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1599.7898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1252.7112, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1623.7833, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1843.3315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1602.7634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1458.5197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1582.8646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2904.2671, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1720.9377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1439.8268, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2235.6165, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2014.0354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2739.4368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1938.0166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1682.7068, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.19647355163727964
Sentence level Krippendorff's alpha for Premises:  0.17884130982367763
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 367
	Prediction sentences having premises: 106
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 16
	Prediction Sentence having neither claim nor premise: 1131
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences having claim in both reference and prediction: 950
	Sentences having claim in only one of reference or prediction: 638
	Sentences having premise in both reference and prediction: 936
	Sentences having premise in only one of reference or prediction: 652
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(2004.6768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1784.2632, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2604.6287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2822.3228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1559.5999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1969.0085, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(810.7983, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(899.7332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1523.1904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1510.0077, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1063.2754, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2068.4463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1480.8706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1187.1410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1439.7990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1167.0210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2439.7600, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1316.7229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1463.1858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2201.2090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2068.0513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1695.3762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2389.2969, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1430.6343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1135.8589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1508.1514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1705.5153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1437.4415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1380.8097, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1531.9521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2927.4500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1705.0339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1339.0552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2368.6748, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1793.3112, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2552.1006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1606.9531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1430.6648, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.2178841309823678
Sentence level Krippendorff's alpha for Premises:  0.1586901763224181
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 296
	Prediction sentences having premises: 220
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 60
	Prediction Sentence having neither claim nor premise: 1132
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences having claim in both reference and prediction: 967
	Sentences having claim in only one of reference or prediction: 621
	Sentences having premise in both reference and prediction: 920
	Sentences having premise in only one of reference or prediction: 668
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(1567.7255, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1651.5552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2069.3040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2183.1699, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1250.2993, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1613.9874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(706.2391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(777.7728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1338.6332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1326.4565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(963.1384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1906.3495, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1348.0979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1087.3937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1319.8208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1051.7981, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2258.6995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1176.6792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1289.1448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2009.9193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1826.4009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1554.1514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2256.3362, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1320.5187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1008.6964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1427.3975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1432.7993, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1195.8589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1127.9304, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1297.6168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2339.0220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1502.8434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1180.7032, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1909.3064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1714.7588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2267.4878, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1486.5635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1321.9781, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.24937027707808568
Sentence level Krippendorff's alpha for Premises:  0.336272040302267
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 583
	Prediction sentences having premises: 787
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 135
	Prediction Sentence having neither claim nor premise: 353
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences having claim in both reference and prediction: 992
	Sentences having claim in only one of reference or prediction: 596
	Sentences having premise in both reference and prediction: 1061
	Sentences having premise in only one of reference or prediction: 527
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(1342.4222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1543.2123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1919.8854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2049.4092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1088.8102, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1473.2458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(673.2664, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(688.6381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1196.4971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1190.8767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(866.8395, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1785.4133, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1243.8262, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(985.7900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1173.6025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.6235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1906.7164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(958.1336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1021.9299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1784.9830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1781.6262, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1276.8138, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1907.2466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1196.3188, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(915.8730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1355.6271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1274.6907, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1002.3857, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(929.3426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1207.7828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2201.7280, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1524.5942, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1041.7456, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1718.3740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1569.8894, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2017.2273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1332.8535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1210.2281, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.26574307304785894
Sentence level Krippendorff's alpha for Premises:  0.33249370277078083
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 656
	Prediction sentences having premises: 800
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 191
	Prediction Sentence having neither claim nor premise: 323
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences having claim in both reference and prediction: 1005
	Sentences having claim in only one of reference or prediction: 583
	Sentences having premise in both reference and prediction: 1058
	Sentences having premise in only one of reference or prediction: 530
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(1093.2107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1397.0056, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1714.1937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1840.1703, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(907.9740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1302.8448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(634.2824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(607.3784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1071.8126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1027.4211, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(782.7429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1619.0679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1120.5906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(894.6678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1011.9254, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(778.4565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1694.2444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(812.4780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(832.0894, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1569.4502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1507.9014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1092.5366, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1667.4854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1063.4968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(784.2083, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1198.4088, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(922.7931, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(677.4882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(645.8506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(996.6082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1901.5632, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1358.6284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(917.4442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1378.9458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1379.8247, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1711.2656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1159.5647, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1075.2946, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.32115869017632237
Sentence level Krippendorff's alpha for Premises:  0.36523929471032746
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 684
	Prediction sentences having premises: 806
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 183
	Prediction Sentence having neither claim nor premise: 281
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences having claim in both reference and prediction: 1049
	Sentences having claim in only one of reference or prediction: 539
	Sentences having premise in both reference and prediction: 1084
	Sentences having premise in only one of reference or prediction: 504
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(924.7007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1236.6810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1467.5096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1561.1594, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(687.0699, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1044.8655, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(523.9512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(493.0584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(889.1574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(831.4183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.1548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1398.8357, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(982.7575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(785.6121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(809.5341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(660.4563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1438.1309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(662.9161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(622.4214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1343.1018, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1179.0854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(871.8984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1425.4269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(920.6503, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(631.2006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(998.4146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(671.2717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(450.1473, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(436.4278, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(753.6898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1621.9531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1171.4485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(774.2411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1037.3510, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1182.2893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1377.2297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(997.0560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(948.3545, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.28337531486146095
Sentence level Krippendorff's alpha for Premises:  0.3841309823677582
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 860
	Prediction sentences having premises: 653
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 175
	Prediction Sentence having neither claim nor premise: 250
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences having claim in both reference and prediction: 1019
	Sentences having claim in only one of reference or prediction: 569
	Sentences having premise in both reference and prediction: 1099
	Sentences having premise in only one of reference or prediction: 489
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(741.0627, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1131.2113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1235.6201, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1344.2271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(496.9140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(822.5869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(417.0865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(386.1370, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(679.8209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(629.2904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(580.3121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1126.3772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(831.7825, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(653.6635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(615.1860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(539.1599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1198.9453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(550.8369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(457.3041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1165.7466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1012.7510, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(707.0677, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1260.1855, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(830.7509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(494.8984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(842.1292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(516.5449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(327.2365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.9379, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(588.8980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1340.2710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(859.6465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(602.3934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(732.4404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(908.9932, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1084.1575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(862.3924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(815.7590, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.21158690176322414
Sentence level Krippendorff's alpha for Premises:  0.4156171284634761
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 1063
	Prediction sentences having premises: 462
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 141
	Prediction Sentence having neither claim nor premise: 204
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences having claim in both reference and prediction: 962
	Sentences having claim in only one of reference or prediction: 626
	Sentences having premise in both reference and prediction: 1124
	Sentences having premise in only one of reference or prediction: 464
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(610.1909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1044.1392, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1091.4584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1269.2463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(449.8397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(776.0282, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.7172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.9139, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(572.5062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(573.8538, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(548.9343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1151.1410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(757.9792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(583.7153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(557.6891, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.4075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(942.5638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(423.9312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(266.1971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1033.1298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(754.9569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(542.0845, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(881.0229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(603.6744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.5812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(818.4613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(368.8160, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(269.5701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.5873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(701.1081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1520.7877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(840.6578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(587.5380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(616.3065, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(801.5640, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1155.0485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(849.1092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(758.6956, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4282115869017632
Sentence level Krippendorff's alpha for Premises:  0.34508816120906805
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 621
	Prediction sentences having premises: 924
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 187
	Prediction Sentence having neither claim nor premise: 230
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences having claim in both reference and prediction: 1134
	Sentences having claim in only one of reference or prediction: 454
	Sentences having premise in both reference and prediction: 1068
	Sentences having premise in only one of reference or prediction: 520
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(522.0722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(652.4965, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(910.3936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1022.9086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(319.5146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(569.9297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.2733, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.3730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(531.1340, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(508.1499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(461.7066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(992.8218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(840.7833, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.2527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(563.6511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(487.4342, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1029.0582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(414.7670, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(242.4285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1215.1525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1114.5923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(659.7255, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(995.6500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(692.4243, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(481.1391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(688.0225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(351.5143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.1548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.5723, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(462.0131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1740.3716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(813.6499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(417.1899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(461.8630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(577.5104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(767.7814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(859.7755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(765.2995, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.33249370277078083
Sentence level Krippendorff's alpha for Premises:  0.371536523929471
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 929
	Prediction sentences having premises: 381
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 164
	Prediction Sentence having neither claim nor premise: 442
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences having claim in both reference and prediction: 1058
	Sentences having claim in only one of reference or prediction: 530
	Sentences having premise in both reference and prediction: 1089
	Sentences having premise in only one of reference or prediction: 499
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(803.1931, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1071.5155, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1424.7330, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1513.5142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.8929, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(660.4208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(327.4066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(268.9807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(484.5109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(392.4246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(423.6730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(827.2200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(673.9628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(475.0709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(368.8774, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(294.2691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(720.4375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(327.3411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.2170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(716.3276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(601.4117, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.9618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(599.5381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.9477, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.0519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(479.6022, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.0519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.0911, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.4917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(414.2136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1238.2224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(713.5880, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(390.9827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(496.9188, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(527.2026, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(695.4240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(745.1750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(660.7258, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3136020151133502
Sentence level Krippendorff's alpha for Premises:  0.40680100755667503
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 896
	Prediction sentences having premises: 631
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 169
	Prediction Sentence having neither claim nor premise: 230
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences having claim in both reference and prediction: 1043
	Sentences having claim in only one of reference or prediction: 545
	Sentences having premise in both reference and prediction: 1117
	Sentences having premise in only one of reference or prediction: 471
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(393.8717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(520.8707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(700.2651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(805.1669, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.5673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(402.4664, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.3955, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.0175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.1485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(316.6318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.3640, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(588.9528, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(626.5251, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(329.9414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.2327, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.2636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(661.6813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(323.6320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.4194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(657.6542, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(537.8316, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(312.0607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(561.4803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(377.5519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.6107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(351.7876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.9844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.6981, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.1060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(319.1661, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(899.7944, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.8644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(472.5551, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(439.9066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(579.6614, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(828.1539, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(618.0515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(488.8232, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4722921914357683
Sentence level Krippendorff's alpha for Premises:  0.4156171284634761
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 564
	Prediction sentences having premises: 948
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 197
	Prediction Sentence having neither claim nor premise: 273
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences having claim in both reference and prediction: 1169
	Sentences having claim in only one of reference or prediction: 419
	Sentences having premise in both reference and prediction: 1124
	Sentences having premise in only one of reference or prediction: 464
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(307.1446, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(366.5948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(574.8580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(680.7428, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.4657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(297.7917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.5328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.2521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(291.6913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(244.9271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(229.6020, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(420.8225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(564.7874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(308.0063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(237.0519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(250.1018, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(903.5790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.7475, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.8491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.8154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(524.4879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.5901, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(879.8192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.8580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.7176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(410.1980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.6100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.9385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.3950, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.4185, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(797.4142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(386.8674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.7941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.6314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.0303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(387.9704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(572.9204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(482.2887, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3904282115869018
Sentence level Krippendorff's alpha for Premises:  0.2972292191435768
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 201
	Prediction sentences having premises: 1208
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 79
	Prediction Sentence having neither claim nor premise: 258
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences having claim in both reference and prediction: 1104
	Sentences having claim in only one of reference or prediction: 484
	Sentences having premise in both reference and prediction: 1030
	Sentences having premise in only one of reference or prediction: 558
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(703.5720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(827.0941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(887.1342, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1037.7600, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.0778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(770.2369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(257.1301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(392.7602, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(643.8246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(495.5443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(497.0690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1214.2526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(711.0270, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(338.5864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.4752, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(158.6377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(520.0268, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.0853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.3306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(371.4712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(373.0448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.2334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(408.1874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.1620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(354.5558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(642.5352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.0324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(158.3849, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.3157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(418.9635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1217.9960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(770.7876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(705.6818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(699.9423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1573.1082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1681.2224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(907.4955, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(796.4477, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.2846347607052897
Sentence level Krippendorff's alpha for Premises:  0.37279596977329976
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 1033
	Prediction sentences having premises: 394
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 163
	Prediction Sentence having neither claim nor premise: 324
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences having claim in both reference and prediction: 1020
	Sentences having claim in only one of reference or prediction: 568
	Sentences having premise in both reference and prediction: 1090
	Sentences having premise in only one of reference or prediction: 498
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(356.7061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.1623, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(687.7853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(854.7694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(240.8190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.1282, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.1710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.3720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.4417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.8904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.3221, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(400.8583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(551.9387, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(262.5067, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.6558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(166.3404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(666.0151, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(313.0498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.3745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(529.9435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(634.6135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(268.8799, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(523.8903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(408.6205, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(323.9193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(364.9616, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.2230, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.1974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.4381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(244.7295, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(926.1137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(459.2303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.6822, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.8676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.3188, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(497.2877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(408.0735, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(330.0670, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4773299748110831
Sentence level Krippendorff's alpha for Premises:  0.3790931989924433
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 482
	Prediction sentences having premises: 1055
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 205
	Prediction Sentence having neither claim nor premise: 256
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences where reference and prediction agree on claim: 1173
	Sentences having claim in only one of reference or prediction: 415
	Sentences where reference and prediction agree on premise: 1095
	Sentences having premise in only one of reference or prediction: 493
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(202.4817, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.2767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(404.8012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(504.0287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.9941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.9573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.5605, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.6715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.3779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.7025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.0995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(329.7512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(417.2870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.0992, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.0618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.6441, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.3252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.4604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.0452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.0661, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.7477, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(146.1057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.2372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.9653, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.0043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.3073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.5796, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.0244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.5112, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.5516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(565.6812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.9734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.2728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.5947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.9531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.7408, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(351.1323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.8196, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4332493702770781
Sentence level Krippendorff's alpha for Premises:  0.4508816120906801
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 787
	Prediction sentences having premises: 700
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 199
	Prediction Sentence having neither claim nor premise: 300
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences where reference and prediction agree on claim: 1138
	Sentences having claim in only one of reference or prediction: 450
	Sentences where reference and prediction agree on premise: 1152
	Sentences having premise in only one of reference or prediction: 436
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(161.7718, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(202.8443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.5642, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(471.9429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.5321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.7925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.0287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.6414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.0011, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.8708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.6535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.5688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.3163, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.5261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.9895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.0076, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(349.3955, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.1852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.6917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.8952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.7469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.6323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.5673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.5195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.3235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.0380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.0593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.7685, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.4041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.1770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.6534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.1104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.8397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.5578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.2431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.0894, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.6204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.2591, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.47858942065491183
Sentence level Krippendorff's alpha for Premises:  0.42947103274559195
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 603
	Prediction sentences having premises: 903
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 191
	Prediction Sentence having neither claim nor premise: 273
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences where reference and prediction agree on claim: 1174
	Sentences having claim in only one of reference or prediction: 414
	Sentences where reference and prediction agree on premise: 1135
	Sentences having premise in only one of reference or prediction: 453
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(113.4679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.3568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.9413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(346.6016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.6404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.6514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.8120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.1512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.6879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.4760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.4093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(164.5750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.8918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.7177, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.6452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.1523, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.7062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.9315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.3988, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.4919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.3365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.1593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.1785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.2467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.2946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.6789, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.6594, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.5882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.2473, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.7466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(429.2813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.4599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.0262, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.2430, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.9489, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.2806, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.5738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.9793, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.47607052896725444
Sentence level Krippendorff's alpha for Premises:  0.4483627204030227
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 677
	Prediction sentences having premises: 840
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 199
	Prediction Sentence having neither claim nor premise: 270
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences where reference and prediction agree on claim: 1172
	Sentences having claim in only one of reference or prediction: 416
	Sentences where reference and prediction agree on premise: 1150
	Sentences having premise in only one of reference or prediction: 438
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(91.4528, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.9418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.2119, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(311.3309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.5906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.7492, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.4232, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.3311, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.7918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.6668, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.5376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.0637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(257.1827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.8732, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.4353, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.4720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.2698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.8740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.5975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.2276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.1946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.8062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.5897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.1989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.7351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.9206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.9368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.9257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.2561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.2586, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(381.5193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.6524, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.3615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.1848, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.2217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.6797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(174.6552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.8536, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4735516372795969
Sentence level Krippendorff's alpha for Premises:  0.43198992443324935
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 659
	Prediction sentences having premises: 827
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 188
	Prediction Sentence having neither claim nor premise: 290
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences where reference and prediction agree on claim: 1170
	Sentences having claim in only one of reference or prediction: 418
	Sentences where reference and prediction agree on premise: 1137
	Sentences having premise in only one of reference or prediction: 451
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(76.6199, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.4529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.3864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(280.8064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.2280, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.7486, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.2301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.7911, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.9847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.7583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.9792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.0584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(222.4416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.5760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.5975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.1384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.9854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.0294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.2409, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.4767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.1797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.7435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.4380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.8234, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.9836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.3418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.3931, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.5348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.0696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.5647, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(342.3094, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.0320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.4294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.8768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.2206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.0273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.9183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.8634, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4748110831234257
Sentence level Krippendorff's alpha for Premises:  0.42947103274559195
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 668
	Prediction sentences having premises: 847
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 192
	Prediction Sentence having neither claim nor premise: 265
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences where reference and prediction agree on claim: 1171
	Sentences having claim in only one of reference or prediction: 417
	Sentences where reference and prediction agree on premise: 1135
	Sentences having premise in only one of reference or prediction: 453
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(63.3666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.7601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.7288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(250.8582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.4154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.9990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.8709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.2962, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.0220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.4218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.7836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.6977, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.9491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.6250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.6371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.2223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.2804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.5717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.0028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.7527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.4060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.9224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.8306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.1278, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.2582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.2125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.4830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.8762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.0850, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.2828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.6063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.2891, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.9547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.8062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.4809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.1207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.1352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.1259, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.464735516372796
Sentence level Krippendorff's alpha for Premises:  0.43450881612090675
Additional attributes: 
	Total Sentences: 1588
	Prediction sentences having claims: 676
	Prediction sentences having premises: 827
	Reference sentences having claims: 579
	Reference sentences having premises: 680


	Prediction Sentence having both claim and premise: 197
	Prediction Sentence having neither claim nor premise: 282
	Reference Sentence having both claim and premise: 119
	Reference Sentence having neither claim nor premise: 448


	Sentences where reference and prediction agree on claim: 1163
	Sentences having claim in only one of reference or prediction: 425
	Sentences where reference and prediction agree on premise: 1139
	Sentences having premise in only one of reference or prediction: 449
				 Metric computations: None


		-------------RUN 5-----------
			------------EPOCH 1---------------
Loss:  tensor(1665.5729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1505.7051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2504.7397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3199.3652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3618.6479, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2749.3372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2385.4800, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2293.8223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2682.4648, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2224.6294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2917.2866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1799.5148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1988.7495, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1415.9941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2894.4116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1750.1898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1729.4301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1727.0372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1742.1628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1809.3000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3311.3789, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1347.6138, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(902.0474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1851.7150, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2984.6008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1322.8647, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2612.9849, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3217.8831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1457.0461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3228.2769, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2753.1421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2233.9932, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1157.6339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1874.3997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2964.2266, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.32626262626262625
Sentence level Krippendorff's alpha for Premises:  0.11717171717171715
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 351
	Prediction sentences having premises: 711
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 275
	Prediction Sentence having neither claim nor premise: 1193
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences where reference and prediction agree on claim: 1313
	Sentences having claim in only one of reference or prediction: 667
	Sentences where reference and prediction agree on premise: 1106
	Sentences having premise in only one of reference or prediction: 874
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(1149.4192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1057.6095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1675.2061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2160.2515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2511.6814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2031.9741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2171.2920, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1759.1611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1995.1221, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1765.4607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2653.9297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1524.2780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1697.7880, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1214.9557, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2514.6975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1536.7759, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1559.6466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1526.3694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1611.0933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1650.8149, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2764.8933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1165.3962, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(791.7700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1704.1589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2527.3442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1112.5554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2235.1299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2996.5095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1273.4490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2919.2441, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2456.5015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1948.5203, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1053.8982, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1578.2511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2665.5908, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.2525252525252525
Sentence level Krippendorff's alpha for Premises:  0.295959595959596
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 846
	Prediction sentences having premises: 858
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 459
	Prediction Sentence having neither claim nor premise: 735
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences where reference and prediction agree on claim: 1240
	Sentences having claim in only one of reference or prediction: 740
	Sentences where reference and prediction agree on premise: 1283
	Sentences having premise in only one of reference or prediction: 697
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(989.3451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(849.0320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1483.0929, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1953.7994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2145.3210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1717.2515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1858.8423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1484.7419, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1603.9272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1541.2405, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2378.0708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1349.8223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1323.1046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(990.7527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2083.9810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1349.1924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1263.3711, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1212.1130, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1386.6064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1576.9597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2352.4163, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(956.7375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(678.6967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1628.5844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2076.9617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(906.1857, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1925.8846, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2876.1384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1048.0885, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2650.7292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2175.3704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1563.6437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(916.3510, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1310.9871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2401.4124, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.19191919191919193
Sentence level Krippendorff's alpha for Premises:  0.33939393939393936
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 1056
	Prediction sentences having premises: 1041
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 646
	Prediction Sentence having neither claim nor premise: 529
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1180
	Sentences having claim in only one of reference or prediction: 800
	Sentences having premise in both reference and prediction: 1326
	Sentences having premise in only one of reference or prediction: 654
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(864.8492, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(680.9187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1252.5332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1654.7417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1788.4124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1426.3667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1651.6044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1257.7560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1246.2959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1310.4246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2057.9475, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1178.4587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1047.3281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(822.5436, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1805.4202, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1088.5114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(818.2008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(755.2836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1001.0389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1370.2727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2111.0525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(809.9714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(582.2520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1455.2017, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1774.7224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(722.3140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1586.3284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2410.8228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(895.4525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2354.5103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1898.5820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1276.1104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(755.1069, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1096.7866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2126.5542, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.23535353535353531
Sentence level Krippendorff's alpha for Premises:  0.3535353535353535
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 961
	Prediction sentences having premises: 963
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 396
	Prediction Sentence having neither claim nor premise: 452
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1223
	Sentences having claim in only one of reference or prediction: 757
	Sentences having premise in both reference and prediction: 1340
	Sentences having premise in only one of reference or prediction: 640
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(739.7472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(517.7357, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1053.4233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1328.7979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1364.8342, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1131.7263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1415.7249, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1020.3341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(954.2015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1029.2559, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1697.0548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(982.5486, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(784.9492, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(641.8400, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1470.7041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(859.7646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(580.0828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(523.8649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(841.9751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1114.6080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1848.5343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(693.8633, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(470.8962, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1201.9666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1428.0046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(528.0079, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1207.9922, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1802.9059, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(723.0754, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1941.7415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1621.7887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1009.9252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(608.2410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(894.8075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1892.7856, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.30808080808080807
Sentence level Krippendorff's alpha for Premises:  0.3424242424242424
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 963
	Prediction sentences having premises: 1108
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 397
	Prediction Sentence having neither claim nor premise: 306
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1295
	Sentences having claim in only one of reference or prediction: 685
	Sentences having premise in both reference and prediction: 1329
	Sentences having premise in only one of reference or prediction: 651
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(603.0730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(373.4098, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(851.0238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1060.6393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1085.0427, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(898.8929, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1185.0236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(792.4324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(732.0099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(796.9495, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1326.1775, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(758.7919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(539.6337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(464.8373, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1156.7678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.0692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(373.7452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.9339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(678.8055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(907.1937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1721.3379, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(658.2643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.4045, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1054.3655, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1260.5457, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(431.9535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(953.2371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1483.8176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(614.4512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1541.0848, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1343.1305, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(703.5804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(494.9883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(740.9346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1459.1234, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3606060606060606
Sentence level Krippendorff's alpha for Premises:  0.3363636363636363
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 753
	Prediction sentences having premises: 1258
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 311
	Prediction Sentence having neither claim nor premise: 280
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1347
	Sentences having claim in only one of reference or prediction: 633
	Sentences having premise in both reference and prediction: 1323
	Sentences having premise in only one of reference or prediction: 657
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(421.2263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.9720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(660.6418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(860.8116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1009.8744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(781.1779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1135.3967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(669.6184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(627.6074, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(778.4821, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1468.1958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(625.9987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(399.2073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(362.4005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1005.6709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(658.4323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.6545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.7939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(500.6307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(812.8486, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1448.9290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(412.6562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.0757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(773.9597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(984.9816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(324.5556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(699.8163, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1190.4426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(638.5967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1467.1230, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1180.4841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(707.5573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(368.0972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(702.5734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1440.2061, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4444444444444444
Sentence level Krippendorff's alpha for Premises:  0.35858585858585856
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 374
	Prediction sentences having premises: 1332
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 187
	Prediction Sentence having neither claim nor premise: 461
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1430
	Sentences having claim in only one of reference or prediction: 550
	Sentences having premise in both reference and prediction: 1345
	Sentences having premise in only one of reference or prediction: 635
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(379.5049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(308.8764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(647.9526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(810.4332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1053.2715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(851.0741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1204.5082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(617.0093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(466.6517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(518.1640, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1105.7556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(429.9536, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.5826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(226.0542, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(728.6915, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(483.8801, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.6766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.7720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(585.6215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(946.5586, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2401.1880, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(623.4524, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(366.0767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1254.7378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1348.2825, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(436.4866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(792.8078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2012.3844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(451.0830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1268.1240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1036.0096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.4324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.5121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(506.7219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(957.9156, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.41515151515151516
Sentence level Krippendorff's alpha for Premises:  0.25151515151515147
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 219
	Prediction sentences having premises: 1536
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 113
	Prediction Sentence having neither claim nor premise: 338
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1401
	Sentences having claim in only one of reference or prediction: 579
	Sentences having premise in both reference and prediction: 1239
	Sentences having premise in only one of reference or prediction: 741
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(374.3772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(356.8008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(583.1798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(678.3056, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(905.8546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(883.5012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1495.7759, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(901.2450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(599.1846, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(629.0225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1709.1526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(870.5558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(853.7064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(621.3598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1367.8207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(626.0039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.0798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.6997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(654.2753, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(799.6575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1325.2388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(514.0590, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.8876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(766.4943, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1008.6293, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(366.7170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(691.2274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1216.9058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(679.6925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1381.4161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1085.6552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(799.7679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(291.0285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(845.5078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1385.0647, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.24343434343434345
Sentence level Krippendorff's alpha for Premises:  0.4363636363636364
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 1089
	Prediction sentences having premises: 829
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 311
	Prediction Sentence having neither claim nor premise: 373
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1231
	Sentences having claim in only one of reference or prediction: 749
	Sentences having premise in both reference and prediction: 1422
	Sentences having premise in only one of reference or prediction: 558
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(374.0092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.9648, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(600.9706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(769.8796, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(741.0784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(558.1356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(825.7670, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.2728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(475.6992, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(523.1114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(872.6274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(379.3289, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.9100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.7038, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(660.4843, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(451.3474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.7651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.9575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(496.9051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(878.7600, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1724.9617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.9073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(360.4159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1005.1707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1163.9591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(361.7350, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(648.8549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1706.9706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(414.4579, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1162.6716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(978.2049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(450.9215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.5185, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(429.7853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(772.9415, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.32525252525252524
Sentence level Krippendorff's alpha for Premises:  0.38787878787878793
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 944
	Prediction sentences having premises: 1091
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 347
	Prediction Sentence having neither claim nor premise: 292
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1312
	Sentences having claim in only one of reference or prediction: 668
	Sentences having premise in both reference and prediction: 1374
	Sentences having premise in only one of reference or prediction: 606
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(210.6786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.2086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.1968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(476.5015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(681.6223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(508.7686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(738.9365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(423.1368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(434.6882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(449.1952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(840.7460, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(337.8407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(281.3212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(276.0389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(629.6935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.9465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.0367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.0818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.1710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(458.9377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(964.3360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.3370, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.4091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(469.2129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(583.9852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.4291, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.7876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(727.1284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(282.0062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(712.7727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(560.8953, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(336.7480, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.6550, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(355.2649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(509.6057, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.40909090909090906
Sentence level Krippendorff's alpha for Premises:  0.4101010101010101
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 657
	Prediction sentences having premises: 1145
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 248
	Prediction Sentence having neither claim nor premise: 426
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1395
	Sentences having claim in only one of reference or prediction: 585
	Sentences having premise in both reference and prediction: 1396
	Sentences having premise in only one of reference or prediction: 584
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(120.9795, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.0657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(337.9592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.9700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(517.7133, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.9698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(553.1257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.0873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(275.2690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.0712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(585.4291, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.0806, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.6842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.6622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(327.3250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(210.0877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.2435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.4955, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.3185, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(378.6721, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(844.6451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.0603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.0770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(395.1692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(567.0318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(158.5449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.5251, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(671.9946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.9565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(654.9095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(531.5489, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(266.5187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.9762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(294.9469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(389.6594, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3919191919191919
Sentence level Krippendorff's alpha for Premises:  0.37777777777777777
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 706
	Prediction sentences having premises: 1197
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 269
	Prediction Sentence having neither claim nor premise: 346
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1378
	Sentences having claim in only one of reference or prediction: 602
	Sentences having premise in both reference and prediction: 1364
	Sentences having premise in only one of reference or prediction: 616
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(72.0542, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.8713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(279.0902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.4210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(469.4020, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.9675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(451.7753, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.2219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(226.7696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.4033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(478.3348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.0812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.0155, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.4190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(249.8444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.2287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.0869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.0082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.6948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(269.9763, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(755.8525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(166.4913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.3190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.8868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(362.3418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.5264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.5058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(404.0513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.1542, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(481.9118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(351.6025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.0437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.8354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.8062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.0052, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.407070707070707
Sentence level Krippendorff's alpha for Premises:  0.4050505050505051
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 717
	Prediction sentences having premises: 1120
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 265
	Prediction Sentence having neither claim nor premise: 408
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1393
	Sentences having claim in only one of reference or prediction: 587
	Sentences having premise in both reference and prediction: 1391
	Sentences having premise in only one of reference or prediction: 589
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(55.5644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.0448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(226.8962, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.0638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(400.9568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.3835, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(359.0039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.9864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.4913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.6233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(403.4474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.7304, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.7114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.9254, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.4770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.4978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.4741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.8024, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.6237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.0760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(700.7144, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.0992, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.1777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(240.6914, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.6484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.8463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.0763, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(316.1676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.7076, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(406.4861, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(304.5803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.7031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.7144, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.9341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.0500, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.41515151515151516
Sentence level Krippendorff's alpha for Premises:  0.3919191919191919
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 657
	Prediction sentences having premises: 1213
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 251
	Prediction Sentence having neither claim nor premise: 361
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1401
	Sentences having claim in only one of reference or prediction: 579
	Sentences having premise in both reference and prediction: 1378
	Sentences having premise in only one of reference or prediction: 602
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(27.7982, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.1264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.1375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.2272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.4128, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.6193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.9170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.1440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.8197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.9132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(334.5717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.2940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.0607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.8459, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.8310, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.6728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.9593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.1786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.0036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.2814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(642.7771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.2523, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.0415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.2011, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.7735, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.5864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.1478, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(222.3899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.7211, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(320.0959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.3335, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.3412, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.0191, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.6605, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.9426, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3939393939393939
Sentence level Krippendorff's alpha for Premises:  0.4111111111111111
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 714
	Prediction sentences having premises: 1120
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 251
	Prediction Sentence having neither claim nor premise: 397
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1380
	Sentences having claim in only one of reference or prediction: 600
	Sentences having premise in both reference and prediction: 1397
	Sentences having premise in only one of reference or prediction: 583
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(20.8948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.3368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.1032, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.2125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.7623, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.9061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.5429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.6390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.1849, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.9602, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.2425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.3370, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.3874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.5893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.3337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.1620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.5484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.7016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.9404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.7913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(610.1014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.8505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.9033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.0336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.7502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.5841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.1243, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.7241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.6413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.1320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(226.0709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.3069, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.4687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.4110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.8736, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4131313131313131
Sentence level Krippendorff's alpha for Premises:  0.40303030303030307
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 661
	Prediction sentences having premises: 1208
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 254
	Prediction Sentence having neither claim nor premise: 365
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1399
	Sentences having claim in only one of reference or prediction: 581
	Sentences having premise in both reference and prediction: 1389
	Sentences having premise in only one of reference or prediction: 591
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(12.7405, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.9570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.2088, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.4216, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.8297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.3570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.9828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.8927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.6348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.1605, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(280.7921, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.3548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.0215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.2596, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.5817, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.1048, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.7782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.4680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.3657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.5420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(568.5913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.3349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.6135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.2509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.4001, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.0852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.5876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.9632, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.0333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.1100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.9548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.0342, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.6617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.0044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.1575, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.406060606060606
Sentence level Krippendorff's alpha for Premises:  0.42020202020202024
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 678
	Prediction sentences having premises: 1163
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 240
	Prediction Sentence having neither claim nor premise: 379
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1392
	Sentences having claim in only one of reference or prediction: 588
	Sentences having premise in both reference and prediction: 1406
	Sentences having premise in only one of reference or prediction: 574
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(10.1284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.9207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.3246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.8808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.6519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.9720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(166.1715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.6363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.7874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.7711, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(275.6909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.5768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.7076, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.9134, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.7603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.1834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.9742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.0869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.5581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.5296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(548.0043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.9696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.8518, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.8141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.9519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.5975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.6991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.7808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.9805, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.6133, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.0866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.9556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.6107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.4278, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.8556, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4101010101010101
Sentence level Krippendorff's alpha for Premises:  0.3919191919191919
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 660
	Prediction sentences having premises: 1203
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 246
	Prediction Sentence having neither claim nor premise: 363
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1396
	Sentences having claim in only one of reference or prediction: 584
	Sentences having premise in both reference and prediction: 1378
	Sentences having premise in only one of reference or prediction: 602
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(7.5415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.6183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.3196, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.0679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.7877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.1768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.3403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.7968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.5565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.1124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.4133, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.5925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.1331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.2682, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.8589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.9476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.5864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.4952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.0121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.1318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(507.3743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.2856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.0149, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.5864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.5301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.0697, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.2870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.4813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.3420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.8345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.2579, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.0344, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.7082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.4392, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.9743, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.406060606060606
Sentence level Krippendorff's alpha for Premises:  0.40909090909090906
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 676
	Prediction sentences having premises: 1178
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 244
	Prediction Sentence having neither claim nor premise: 370
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1392
	Sentences having claim in only one of reference or prediction: 588
	Sentences having premise in both reference and prediction: 1395
	Sentences having premise in only one of reference or prediction: 585
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(6.1066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.0179, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.4123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.5663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.8119, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.8799, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.1787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.8726, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.8335, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.9181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.5154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.5994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.7575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.6245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.5875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.6603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.1698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.7987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.0051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.6244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(495.0836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.8980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.5639, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.7136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.5094, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.2385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.7813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.8524, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.6563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.3741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.4775, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.4621, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.5207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.7552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.4880, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.40808080808080804
Sentence level Krippendorff's alpha for Premises:  0.4
Additional attributes: 
	Total Sentences: 1980
	Prediction sentences having claims: 664
	Prediction sentences having premises: 1193
	Reference sentences having claims: 688
	Reference sentences having premises: 857


	Prediction Sentence having both claim and premise: 240
	Prediction Sentence having neither claim nor premise: 363
	Reference Sentence having both claim and premise: 138
	Reference Sentence having neither claim nor premise: 573


	Sentences having claim in both reference and prediction: 1394
	Sentences having claim in only one of reference or prediction: 586
	Sentences having premise in both reference and prediction: 1386
	Sentences having premise in only one of reference or prediction: 594
				 Metric computations: None
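
Note on the reported alpha values: in every epoch above, the logged "Krippendorff's alpha" coincides with (agree - disagree) / total, where "agree" is the "in both reference and prediction" count and "disagree" is the "in only one of" count. This is an observation from the logged numbers, not something confirmed from the training script. The "in both" counts exceed the 688 reference claim sentences, so they appear to count agreeing sentences, including those where neither side has a claim. The same quantity equals 1 - D_o / 0.5 with D_o = disagree / total, which would match Krippendorff's alpha only if the expected disagreement were fixed at 0.5 (an assumption here). A minimal sketch that reproduces the epoch 20 values, with `agreement_score` as a hypothetical helper name:

```python
def agreement_score(agree: int, disagree: int) -> float:
    """Return (agree - disagree) / total.

    Equivalent to 1 - D_o / 0.5 with D_o = disagree / total.
    Assumption: this mirrors how the logged value was derived; the
    original script is not shown, so this is only a consistency check.
    """
    total = agree + disagree
    if total == 0:
        raise ValueError("no sentences to score")
    return (agree - disagree) / total


# Epoch 20 counts copied from the log above.
claims = agreement_score(agree=1394, disagree=586)
premises = agreement_score(agree=1386, disagree=594)

print(f"claims:   {claims:.4f}")    # 0.4081, log reports 0.40808080808080804
print(f"premises: {premises:.4f}")  # 0.4000, log reports 0.4
```

Because this quantity ignores the actual label distribution, it may differ from an alpha computed with a dedicated library on the same sentence labels.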
