Tokenizer: ../arg_m/arg_mining/smlm_pretrained_iter5_0/tokenizer Model: ../arg_m/arg_mining/smlm_pretrained_iter5_0/model
	Train size: 80 Test size: 20


		-------------RUN 1-----------
			------------EPOCH 1---------------
Loss:  tensor(1632.2418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2598.6572, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3458.1455, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1800.8002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2668.1836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2827.6309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2782.7485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3393.3445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2916.7246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2084.4880, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1818.7637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1738.1309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2578.5186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1551.4884, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1462.2534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1677.2603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3840.3020, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2980.3809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2396.5635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1614.4727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2534.4299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2305.9814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1174.3910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1697.2766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2400.5493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2927.4126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1746.6819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1485.5341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1282.8342, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2079.8115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1180.0979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1094.7063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(826.4135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1440.7991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2671.2456, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2573.0049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1687.8679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2146.0493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1275.7069, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1805.7107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1087.0736, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1934.7933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(990.4169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2040.9919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2001.0164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1626.1879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2138.3159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1431.7059, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1676.3534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1152.2974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1206.0918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1336.2603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2576.3999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2398.3918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3041.6157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1385.4449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(717.1018, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1281.2173, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3143712574850299
Sentence level Krippendorff's alpha for Premises:  0.16167664670658688
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 60
	Prediction sentences having premises: 359
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 17
	Prediction Sentence having neither claim nor premise: 266
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both reference and prediction: 439
	Sentences having claim in only one of reference or prediction: 229
	Sentences having premise in both reference and prediction: 388
	Sentences having premise in only one of reference or prediction: 280
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(1088.8715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1572.4885, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2246.0503, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1233.9663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2095.7876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2144.4116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2063.4119, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2668.1201, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2400.9727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1724.5236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1503.8121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1436.2831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2301.1099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1297.1794, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1209.9075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1372.0730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3235.0652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2493.6445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1781.7687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1198.8037, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2050.1709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2103.2861, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(937.7086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1280.9092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2099.4788, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2452.1892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1543.9490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1166.8525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1068.1718, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1678.2932, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1064.3918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(946.5038, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(674.4650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1181.9071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2405.3916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2302.5613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1402.8567, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1763.5952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1102.1848, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1541.1968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(905.1936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1683.6641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(799.1733, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1752.4814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1684.0555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1347.5714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1800.2535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1069.7764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1297.0040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(982.1939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1012.0714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1095.6581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2301.5452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1980.6444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2490.9580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1154.2271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(583.3901, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1089.7209, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.29640718562874246
Sentence level Krippendorff's alpha for Premises:  0.39820359281437123
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 144
	Prediction sentences having premises: 466
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 57
	Prediction Sentence having neither claim nor premise: 115
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both reference and prediction: 433
	Sentences having claim in only one of reference or prediction: 235
	Sentences having premise in both reference and prediction: 467
	Sentences having premise in only one of reference or prediction: 201
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(900.9854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1260.1458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1836.6589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1057.4590, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1990.3296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2165.9819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1973.7173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2321.2764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2074.5322, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1414.4717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1156.7900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1131.1281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1959.9485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1038.2778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1020.0902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1130.6763, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2760.2029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2190.6580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1457.3652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(933.2487, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1793.6462, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1943.8574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(792.4808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(999.6046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1932.6036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2254.6279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1382.8291, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(948.2550, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(896.2142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1459.2207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1036.6470, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(854.0900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(588.3904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1089.0605, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2304.5293, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2197.0920, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1210.0183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1596.3193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(977.1116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1374.7458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(785.5233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1525.5841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(652.5358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1511.4143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1388.5889, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1098.6842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1471.4325, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(841.6389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1040.3589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(812.5651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(834.3121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(833.4193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1927.6095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1568.8032, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1988.4772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(890.9173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(450.9219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(851.3699, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3802395209580839
Sentence level Krippendorff's alpha for Premises:  0.39520958083832336
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 122
	Prediction sentences having premises: 481
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 34
	Prediction Sentence having neither claim nor premise: 99
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both reference and prediction: 461
	Sentences having claim in only one of reference or prediction: 207
	Sentences having premise in both reference and prediction: 466
	Sentences having premise in only one of reference or prediction: 202
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(750.0319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1041.5204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1571.3381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(902.4001, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1827.4482, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2094.1602, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1838.9548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2009.7581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1812.2533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1121.6106, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(834.2855, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(836.1439, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1807.2900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(884.1731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(877.6984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(965.9249, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2451.6719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1913.9869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1206.6749, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(650.2963, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1446.8687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1796.5604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(667.4703, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(767.5421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1562.9272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1836.0637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1332.5903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(716.8818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(698.9055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1120.9695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(958.5317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(670.4562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(469.8241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(875.7377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1826.2600, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1743.8712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1013.0516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1423.2355, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(844.5679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1297.6067, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(724.9913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1542.1541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(570.6086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1447.1714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1349.3898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1013.8021, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1281.7534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(900.4241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(883.3832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(681.0288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(702.2302, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(624.2614, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1419.0052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1319.6389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1739.1771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(696.4009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(304.9387, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(619.5125, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4011976047904192
Sentence level Krippendorff's alpha for Premises:  0.5029940119760479
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 235
	Prediction sentences having premises: 411
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 73
	Prediction Sentence having neither claim nor premise: 95
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both reference and prediction: 468
	Sentences having claim in only one of reference or prediction: 200
	Sentences having premise in both reference and prediction: 502
	Sentences having premise in only one of reference or prediction: 166
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(597.4985, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(807.2291, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1277.6355, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(719.6812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1598.3315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1895.1615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1565.1427, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1657.1649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1536.7756, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(870.5660, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(517.7454, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(517.6982, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1686.2905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(754.1005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(810.0187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(764.5858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2306.5933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1524.2147, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1077.0787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(447.7976, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1242.5636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1727.4631, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(591.3399, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(642.8666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1311.0919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1639.2676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1300.8560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(574.1672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(530.4369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.1864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(925.2306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(514.9979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(366.0000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(655.8793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1309.9139, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1373.7324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(775.1279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(859.9879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(629.9618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1065.0278, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(686.1708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1360.7457, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(433.3004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1097.6799, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1176.7863, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(776.5251, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.4996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(586.9672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(732.0471, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(591.0195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(617.2855, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(549.9873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1350.2526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1245.0275, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1603.0269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(612.3985, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(254.6834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(537.5900, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3293413173652695
Sentence level Krippendorff's alpha for Premises:  0.4730538922155688
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 325
	Prediction sentences having premises: 289
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 83
	Prediction Sentence having neither claim nor premise: 137
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both reference and prediction: 444
	Sentences having claim in only one of reference or prediction: 224
	Sentences having premise in both reference and prediction: 492
	Sentences having premise in only one of reference or prediction: 176
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(695.5554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(871.3787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1413.2913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(689.0997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1249.7333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1235.4628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1073.0709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1428.1315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1120.3820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(484.7008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.9393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.6354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1126.6143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(712.3608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(701.3422, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(664.7730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2369.4624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1444.1848, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1126.9235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(395.4625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1057.0233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1610.4524, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(504.2192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(456.6127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1160.6066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1405.0549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1296.6227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.1354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(507.2577, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(762.0859, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(982.0010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(450.5733, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(387.0872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(741.2824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1830.8317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1650.5361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1151.4852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1083.3904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(567.9788, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(986.2150, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(432.9030, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1055.8505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.1886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(883.3104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(780.6077, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(597.9949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(738.4973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(407.9470, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(588.2725, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(544.2756, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(572.8917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(469.8238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1098.8687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1100.4033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1377.4712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(495.4941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.2556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(407.2067, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4730538922155688
Sentence level Krippendorff's alpha for Premises:  0.5119760479041916
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 101
	Prediction sentences having premises: 378
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 25
	Prediction Sentence having neither claim nor premise: 214
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both reference and prediction: 492
	Sentences having claim in only one of reference or prediction: 176
	Sentences having premise in both reference and prediction: 505
	Sentences having premise in only one of reference or prediction: 163
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(483.2193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(731.1742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1303.1488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(700.1294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1340.4404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1069.7568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(941.6000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1235.7109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1044.4895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.9251, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.2082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(298.1851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1017.1910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(637.9684, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(549.4357, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(532.8181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1715.6063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1058.9000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(810.3010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.7549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(769.6063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1141.4958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(374.1370, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.5637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(965.0098, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1143.5366, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(916.6772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(306.7230, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(311.4178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(566.6866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(706.1066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.5370, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(294.7954, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(401.8842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1108.1133, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1142.5576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(690.5499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(637.5884, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(455.7686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(846.5273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(347.3337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(908.3108, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.8814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(809.0304, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(831.1899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(560.4064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(701.1946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(367.6418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(458.3823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(451.0975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(462.1345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(316.0985, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(689.7087, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(686.6967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1181.1542, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(382.1998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.6545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.8282, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4491017964071856
Sentence level Krippendorff's alpha for Premises:  0.4640718562874252
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 155
	Prediction sentences having premises: 434
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 53
	Prediction Sentence having neither claim nor premise: 132
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both or neither of reference and prediction: 484
	Sentences having claim in only one of reference or prediction: 184
	Sentences having premise in both or neither of reference and prediction: 489
	Sentences having premise in only one of reference or prediction: 179
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(222.8418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(491.8171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(727.6660, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.4139, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1019.5637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(852.2664, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(766.7068, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(847.2505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(888.7140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(330.7451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.1848, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.8721, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(884.3163, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(604.5632, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(475.5630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(380.4170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1440.2114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(755.2399, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(751.7441, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.9163, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.0674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(877.1743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(287.9303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(226.1014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(706.8541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(868.1228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(669.3835, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.8986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.8488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(393.4389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(542.9077, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(202.3511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.0437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.2972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(755.7956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(854.4568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(681.4775, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(632.8153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(311.1505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(599.0288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.9059, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(793.4257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.0690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(526.1234, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(604.8496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(480.1046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(530.1414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.3084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(349.0052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.0706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(268.0587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.0656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.2142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(410.5916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(740.2844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(338.7075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.5066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(222.6977, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4760479041916168
Sentence level Krippendorff's alpha for Premises:  0.4880239520958084
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 146
	Prediction sentences having premises: 440
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 46
	Prediction Sentence having neither claim nor premise: 128
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both or neither of reference and prediction: 493
	Sentences having claim in only one of reference or prediction: 175
	Sentences having premise in both or neither of reference and prediction: 497
	Sentences having premise in only one of reference or prediction: 171
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(132.0407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(390.2565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(486.0383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(351.0371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(862.5040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(564.3229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(558.6885, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(448.7760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(526.7698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.7866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.2429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.6819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(642.7491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(446.2766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(320.4717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(333.4844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1211.0981, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(662.3677, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(653.0812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.0799, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.1752, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(664.1885, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.1554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.4813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(562.7577, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(699.9414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.5861, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(184.7782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.3331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.3003, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(377.4774, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.5629, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.8615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.8432, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(545.1254, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(599.4165, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(508.9741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(553.4509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.5466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(450.1021, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.5863, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(692.4519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.7000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.5501, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(563.0430, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(462.9195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(585.3126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(276.3325, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(281.6754, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.1654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.7000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.0547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(354.6827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(347.9829, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(685.7916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.0735, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.0176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.4423, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4281437125748503
Sentence level Krippendorff's alpha for Premises:  0.47005988023952094
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 196
	Prediction sentences having premises: 398
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 47
	Prediction Sentence having neither claim nor premise: 121
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both or neither of reference and prediction: 477
	Sentences having claim in only one of reference or prediction: 191
	Sentences having premise in both or neither of reference and prediction: 491
	Sentences having premise in only one of reference or prediction: 177
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(114.2429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(311.9654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.4359, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.0437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(677.4481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(392.9571, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(416.8515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.8005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(395.4322, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.7397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.5218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.3714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(501.1360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(398.1956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(324.1282, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(271.6630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1320.1740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(593.6301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(639.3331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(146.9420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(429.0257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(669.0245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.4783, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.4474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(714.3980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(822.0334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(559.2620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.1888, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.1832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.8478, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(291.7587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.5178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.8993, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.2035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.7516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(438.8983, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(392.2274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(377.1390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.9123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(387.3375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.6581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(623.8459, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.1977, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(387.0800, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(719.4727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(492.2125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.5447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.4268, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(351.4208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(439.1088, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(443.0106, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.9107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(734.6873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(651.3409, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1045.5977, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(374.0418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.1279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.5578, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.25149700598802394
Sentence level Krippendorff's alpha for Premises:  0.34730538922155685
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 429
	Prediction sentences having premises: 177
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 78
	Prediction Sentence having neither claim nor premise: 140
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both or neither of reference and prediction: 418
	Sentences having claim in only one of reference or prediction: 250
	Sentences having premise in both or neither of reference and prediction: 450
	Sentences having premise in only one of reference or prediction: 218
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(200.9888, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.2538, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(660.3323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.6965, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(661.7563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(294.1234, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(404.0119, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(421.9727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.0192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.2642, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.2972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.8670, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(360.0375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.4052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.3265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(210.0045, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(843.2900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(444.1391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(400.0377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.8057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.6194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(578.7635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(329.9071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.8402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(930.0663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1109.1838, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(966.7452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(700.2283, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(570.3199, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(482.4664, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(739.4599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(471.1086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.7140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(400.9385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.6569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(800.5398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(454.6273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.2100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.8270, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(695.2072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.4352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(566.2037, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.6828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(257.3584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(323.9356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(335.0644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.8320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.0861, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.2895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.4358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.9554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.5694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(528.3073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(434.9982, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(683.6034, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(355.8633, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.4582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(417.6890, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.050898203592814384
Sentence level Krippendorff's alpha for Premises:  0.14970059880239517
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 512
	Prediction sentences having premises: 39
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 16
	Prediction Sentence having neither claim nor premise: 133
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both or neither of reference and prediction: 351
	Sentences having claim in only one of reference or prediction: 317
	Sentences having premise in both or neither of reference and prediction: 384
	Sentences having premise in only one of reference or prediction: 284
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(1237.2585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(704.7783, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1356.2601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(917.6514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1206.9961, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(617.7227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(640.5795, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1984.9042, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(768.5081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.0961, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(229.5303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.1466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(612.2294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(583.7628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(341.9958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.0750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(935.2968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(541.0259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(578.5325, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.1762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(354.3823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(618.3651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(210.7101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.0363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(500.9893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(604.4661, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(459.6591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.2021, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.6070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(285.6741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.5167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.5625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(174.7760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.9941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(493.4309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(659.6768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(525.3403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(557.9768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(467.5331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(862.6437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(491.3087, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(976.9809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(262.9909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(855.0344, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(459.5963, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(365.9369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(449.7380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.4239, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.5014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.8047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(257.8766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(361.8176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(455.7906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(371.2596, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(663.6973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.2371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.8396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.6371, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3922155688622755
Sentence level Krippendorff's alpha for Premises:  0.4520958083832335
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 320
	Prediction sentences having premises: 320
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 80
	Prediction Sentence having neither claim nor premise: 108
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both or neither of reference and prediction: 465
	Sentences having claim in only one of reference or prediction: 203
	Sentences having premise in both or neither of reference and prediction: 485
	Sentences having premise in only one of reference or prediction: 183
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(97.4228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.5449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(342.8489, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.3225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(633.0208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.4359, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(406.6794, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.7995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(463.5453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.7687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.3931, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.6603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(707.1414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.6944, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(229.0494, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(304.1301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1177.0227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(603.6552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(677.7401, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.7811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(366.4510, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.5883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.6746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.3425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(525.2228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(568.0908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.7734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.9290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.9372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.0654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.5173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.6156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.6931, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.9710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(317.7884, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(474.9852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(435.1450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(359.7112, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.9239, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.9709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(158.6272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(476.6592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.8060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.7526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.1301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.3223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.5150, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.5392, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(174.0678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.6927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.8745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.8315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(304.3915, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.9323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(499.1584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.9372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.5650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.0718, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.43113772455089816
Sentence level Krippendorff's alpha for Premises:  0.4610778443113772
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 189
	Prediction sentences having premises: 407
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 53
	Prediction Sentence having neither claim nor premise: 125
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both or neither of reference and prediction: 478
	Sentences having claim in only one of reference or prediction: 190
	Sentences having premise in both or neither of reference and prediction: 488
	Sentences having premise in only one of reference or prediction: 180
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(73.3019, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.8408, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(249.6728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.8348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(581.0253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.5453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(249.3463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.8646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(295.6659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.3298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.5820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.0164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(353.8877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.8158, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.1022, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.1364, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(773.3940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(390.1688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(396.3178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.9899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.7190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.7272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.5105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.6998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.5811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(625.1787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.8130, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.5715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.9666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.1051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.7206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.4909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.2477, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.4117, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.1664, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.8998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(316.5014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.1824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.3280, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.8122, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.4384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(368.2681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.9839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.1913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.6826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.9922, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.4905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.6324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.7112, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.9558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.8350, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.7003, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.1731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.0493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(335.4309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.6852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.2848, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.6955, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.437125748502994
Sentence level Krippendorff's alpha for Premises:  0.5029940119760479
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 209
	Prediction sentences having premises: 345
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 56
	Prediction Sentence having neither claim nor premise: 170
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both reference and prediction: 480
	Sentences having claim in only one of reference or prediction: 188
	Sentences having premise in both reference and prediction: 502
	Sentences having premise in only one of reference or prediction: 166
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(84.6555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.7782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(347.7939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.3093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(462.8301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.1122, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(239.1872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.7408, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.3474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.4203, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.6149, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.7826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.4332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.3777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.5540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.2980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(592.3138, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(275.1015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.4343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.8389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.9363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.7271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.7158, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.8328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.9635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(396.3856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.6702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.5721, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.7650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.4307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.4960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.6599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.1007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.9268, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.7591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(240.0635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.1786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.1550, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.0233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.2417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.8183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(364.9210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.2033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.0483, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.6041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.5672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.3160, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.9136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.3394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.5765, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.5445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.9866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.6777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.7101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(277.4093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.1343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.8929, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.6072, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4251497005988024
Sentence level Krippendorff's alpha for Premises:  0.5029940119760479
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 261
	Prediction sentences having premises: 371
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 81
	Prediction Sentence having neither claim nor premise: 117
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both reference and prediction: 476
	Sentences having claim in only one of reference or prediction: 192
	Sentences having premise in both reference and prediction: 502
	Sentences having premise in only one of reference or prediction: 166
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(40.6544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.4371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.8828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.7415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.9144, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.3413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.6782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.7396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.3604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.4025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.3372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.5181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(206.7970, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.8897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.0405, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.9245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(485.9111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.2192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.7184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.3064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.3292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.1714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.6950, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.5084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(164.0716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.5885, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.7111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.5604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.9097, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.5790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.1938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.0709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.4233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.9509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.4003, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.2442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.8008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.9952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.6102, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.8225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.7489, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(244.4166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.0003, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.9939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.1483, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.5017, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.6692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.0221, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.8293, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.7744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.1884, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.0354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.2673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.1721, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.8789, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.8165, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.8525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.2306, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4131736526946108
Sentence level Krippendorff's alpha for Premises:  0.4880239520958084
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 259
	Prediction sentences having premises: 364
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 82
	Prediction Sentence having neither claim nor premise: 127
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both reference and prediction: 472
	Sentences having claim in only one of reference or prediction: 196
	Sentences having premise in both reference and prediction: 497
	Sentences having premise in only one of reference or prediction: 171
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(27.4996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.4171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.7177, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.4607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.5746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.6467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.0540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.7983, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.9335, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.6120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.5304, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.4345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.6141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.9752, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.9050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.2536, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.6713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.6263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.3184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.0788, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.5711, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.5422, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.9393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.1304, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.3274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.5031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.4820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.6447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.4738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.7674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.9531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.7419, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.3348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.2227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.1120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.4840, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.6054, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.9874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.5766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.6543, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.1809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.7179, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.0452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.0946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.5692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.3695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.2054, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.0993, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.1510, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.1765, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.2526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.8807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.9503, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.8115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.8368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.4538, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.2228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.0020, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.41916167664670656
Sentence level Krippendorff's alpha for Premises:  0.48203592814371254
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 245
	Prediction sentences having premises: 372
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 79
	Prediction Sentence having neither claim nor premise: 130
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both reference and prediction: 474
	Sentences having claim in only one of reference or prediction: 194
	Sentences having premise in both reference and prediction: 495
	Sentences having premise in only one of reference or prediction: 173
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(20.2238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.9391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.2648, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.2746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.5016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.7822, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.4010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.8702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.0403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.7250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.4699, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.0649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.6472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.5647, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.1973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.8956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(421.4017, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.7616, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.5535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.2096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.7709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.9128, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.9599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.8442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.7641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.7137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.3800, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.3749, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.3351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.7235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.7258, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.4253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.1730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.3343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.8761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.7783, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.6477, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.2833, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.5617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.2966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.3272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.6718, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.9724, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.7305, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.4974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.9334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.2372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.2287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.3556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.2488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.4403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.0209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.2707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.5317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.2599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.6479, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.4194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.2722, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4131736526946108
Sentence level Krippendorff's alpha for Premises:  0.49101796407185627
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 301
	Prediction sentences having premises: 333
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 92
	Prediction Sentence having neither claim nor premise: 126
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both reference and prediction: 472
	Sentences having claim in only one of reference or prediction: 196
	Sentences having premise in both reference and prediction: 498
	Sentences having premise in only one of reference or prediction: 170
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(14.5638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.4411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.0298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.8850, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.6118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.6741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.5269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.0476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.7250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.2986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.0526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.5306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.5814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.9030, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.4647, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.7537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(397.9154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.1625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.8980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.6118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.6567, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.8001, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.3277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.2832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.3751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.9520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.8292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.1680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.4193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.5380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.6540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.7891, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.9449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.1132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.4807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.7888, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.1613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.8165, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.7874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.5565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.3604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.7381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.3195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.4838, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.8224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.9734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.3941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.8652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.0058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.3040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.6474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.4941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.4491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.7020, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.7512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.8312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(5.2099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.3153, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4131736526946108
Sentence level Krippendorff's alpha for Premises:  0.49101796407185627
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 253
	Prediction sentences having premises: 373
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 83
	Prediction Sentence having neither claim nor premise: 125
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both reference and prediction: 472
	Sentences having claim in only one of reference or prediction: 196
	Sentences having premise in both reference and prediction: 498
	Sentences having premise in only one of reference or prediction: 170
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(11.4408, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.9395, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.8438, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.5601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.6332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.8997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.3618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.2652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.9016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.1432, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.1786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.7410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.2726, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.0335, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.9668, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.0418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(360.1838, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.6579, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.0805, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.1011, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.4306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.6194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.8646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.0490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.8248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.9830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.6127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.7241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.3910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.0360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.0354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(6.2447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.1852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.4204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.9214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.8068, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.9554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.8950, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.9666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.6643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.0437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.4349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(6.9312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.7652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.4342, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.8985, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.7544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.7344, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.7146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.9091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.0333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.0624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.0599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.1644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.5797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.1212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3.9540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.0867, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.42215568862275454
Sentence level Krippendorff's alpha for Premises:  0.49401197604790414
Additional attributes: 
	Total Sentences: 668
	Prediction sentences having claims: 254
	Prediction sentences having premises: 370
	Reference sentences having claims: 225
	Reference sentences having premises: 313


	Prediction Sentence having both claim and premise: 83
	Prediction Sentence having neither claim nor premise: 127
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 180


	Sentences having claim in both reference and prediction: 475
	Sentences having claim in only one of reference or prediction: 193
	Sentences having premise in both reference and prediction: 499
	Sentences having premise in only one of reference or prediction: 169
				 Metric computations: None


		-------------RUN 2-----------
			------------EPOCH 1---------------
Loss:  tensor(3683.8796, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2739.8467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2559.7124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1660.4485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1389.7927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1600.2986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2058.8081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2158.0757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1529.7112, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1899.7756, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1652.7786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1563.2681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1651.4202, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1112.6129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1935.5906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1593.2629, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1284.1865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1254.2257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2263.6548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1420.2473, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2036.6353, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2229.2197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1389.9866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1680.6273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1338.3816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1143.3683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1641.3613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1817.7937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(930.5741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(980.3826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1092.7366, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1791.8936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2606.5859, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1753.5389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2326.2148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1513.7173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2483.4707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1299.3862, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2459.9919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2052.2673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1388.4824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1515.6852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1112.7843, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(925.6429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(835.6178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1087.6680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1852.3044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1056.4110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(883.3519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1765.2424, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1169.1300, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1900.8110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2465.8906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2356.8416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1021.1824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1867.3210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2133.2322, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.25
Sentence level Krippendorff's alpha for Premises:  0.2904411764705882
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 247
	Prediction sentences having premises: 212
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 35
	Prediction Sentence having neither claim nor premise: 120
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 340
	Sentences having claim in only one of reference or prediction: 204
	Sentences having premise in both reference and prediction: 351
	Sentences having premise in only one of reference or prediction: 193
				 Metric computations: None
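Note on reading the counts in these per-epoch blocks: the "in both reference and prediction" and "in only one of reference or prediction" lines sum to the total sentence count (for epoch 1 claims, 340 + 204 = 544), and the "both" figure exceeds the 211 reference claim sentences. The two lines therefore appear to count sentences on which reference and prediction agree versus disagree. The reported alpha values are numerically consistent with 1 - 2 * disagree / total, which corresponds to an expected disagreement of 0.5. This is inferred from the logged numbers only and is not a confirmed description of the evaluation code. The function name below is hypothetical.

```python
def alpha_from_counts(agree: int, disagree: int) -> float:
    """Agreement score inferred from the log: 1 - observed / expected disagreement.

    Assumption: expected disagreement is fixed at 0.5 (two equally likely
    binary labels). This is inferred from the logged values, not taken
    from the evaluation source.
    """
    total = agree + disagree
    observed_disagreement = disagree / total
    return 1.0 - observed_disagreement / 0.5


# Epoch 1 counts from the log above.
claims_alpha = alpha_from_counts(agree=340, disagree=204)
premises_alpha = alpha_from_counts(agree=351, disagree=193)
print(claims_alpha)              # 0.25
print(round(premises_alpha, 4))  # 0.2904
```

The same relation can be used to sanity-check later epochs: a block whose counts do not reproduce its logged alpha would suggest the counts and the score were computed from different data.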
			------------EPOCH 2---------------
Loss:  tensor(2553.9004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1954.5210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1772.7131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1165.8187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1139.4088, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1327.2632, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1557.0043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1557.2537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1181.7783, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1507.0034, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1248.3101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1119.8839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1301.4625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.0797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1514.5912, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1190.3413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1032.6953, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1030.5311, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1778.9525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(990.5410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1626.4614, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1820.6746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1042.2632, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1267.6685, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1114.6515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(867.0558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1330.9790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1548.9335, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(691.7438, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(795.5747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(911.9377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1277.7288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2109.0049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1240.4792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1835.8191, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1283.9091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1986.4229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1046.4371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2097.3987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1851.6744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1010.8517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1341.3594, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(844.6816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(710.7266, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(676.8291, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(927.8253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1596.3861, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(772.1919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(686.1394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1370.3257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(985.3867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1377.1204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2115.8914, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2067.7351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(901.7084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1573.6429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1799.1472, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3529411764705882
Sentence level Krippendorff's alpha for Premises:  0.3713235294117647
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 279
	Prediction sentences having premises: 222
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 47
	Prediction Sentence having neither claim nor premise: 90
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 368
	Sentences having claim in only one of reference or prediction: 176
	Sentences having premise in both reference and prediction: 373
	Sentences having premise in only one of reference or prediction: 171
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(2165.5430, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1656.8296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1400.9396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(794.9124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(736.7140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1013.2902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1197.3656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1295.1211, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1011.0779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1252.1895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1042.3312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(879.9158, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1035.1956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(732.7711, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1268.5201, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(995.2050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(886.0029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(849.4635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1447.7277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(683.5680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1309.3989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1549.0081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(841.6021, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1047.7212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1009.6346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(771.2827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1052.3228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1324.8326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(554.0038, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(694.4120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(713.6602, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(961.3576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1684.9380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1006.0408, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1551.8639, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1058.9290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1647.1451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(768.3966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1740.5892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1455.7034, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(721.3510, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1081.0420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.9404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(571.3308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(535.7744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(818.9265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1494.6494, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(557.6241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(492.8275, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(976.2319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(809.4492, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1223.8242, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1759.9690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1692.7656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(693.3250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1202.6681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1763.6129, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.38602941176470584
Sentence level Krippendorff's alpha for Premises:  0.36397058823529416
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 160
	Prediction sentences having premises: 136
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 21
	Prediction Sentence having neither claim nor premise: 269
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 377
	Sentences having claim in only one of reference or prediction: 167
	Sentences having premise in both reference and prediction: 371
	Sentences having premise in only one of reference or prediction: 173
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(2693.6411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1858.6598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1407.9253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(754.8162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(471.3766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(839.5452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1016.2082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1010.3029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(886.4454, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1140.2404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(911.8315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(797.6178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(978.2444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(697.8499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1211.1814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(955.8929, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(791.8271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(819.3701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1322.9604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(695.4567, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1565.4346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2087.1853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(798.7815, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1018.4802, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(837.9960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(659.7153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1038.5386, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1190.2656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(519.1356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(657.2118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(727.7825, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1141.0549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2024.0319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(950.9102, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1386.1183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(993.3768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1521.9651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(664.9102, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1575.8776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1351.3645, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(673.9540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(938.8979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(580.7974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.3799, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.3906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(728.0909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1222.0886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(491.7877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(475.4728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(931.7240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(715.3571, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1168.3696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1649.3140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1633.8210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(689.1892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1170.0784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1444.3308, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.375
Sentence level Krippendorff's alpha for Premises:  0.5073529411764706
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 159
	Prediction sentences having premises: 249
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 46
	Prediction Sentence having neither claim nor premise: 182
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 374
	Sentences having claim in only one of reference or prediction: 170
	Sentences having premise in both reference and prediction: 410
	Sentences having premise in only one of reference or prediction: 134
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(1811.7681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1371.6106, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1039.0878, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(530.6804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(448.6433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(773.3771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1100.5840, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1025.7881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(808.0985, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(944.5205, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(863.4036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(775.6443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(852.2328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(584.3804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1037.1594, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(847.7309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(683.6349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(614.0058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1259.0892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(526.1959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(958.6217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1240.6257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(519.0904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(710.2087, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(723.2638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(534.0526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(763.8751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1023.5310, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(411.4860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(583.9178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(575.5658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(690.0465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1396.1893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(760.3923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1139.1458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(903.5004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1367.6740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(555.9769, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1319.4534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1112.8264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(516.4108, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(800.5319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(441.2450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(335.9018, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(327.2787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(594.8107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1091.0148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.0136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.2922, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(648.0110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(586.3361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1098.3691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1312.6747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1296.1730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(493.3374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(871.2349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1226.3706, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4338235294117647
Sentence level Krippendorff's alpha for Premises:  0.5441176470588236
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 193
	Prediction sentences having premises: 235
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 60
	Prediction Sentence having neither claim nor premise: 176
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 390
	Sentences having claim in only one of reference or prediction: 154
	Sentences having premise in both reference and prediction: 420
	Sentences having premise in only one of reference or prediction: 124
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(1413.8826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1091.1815, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(694.1375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(320.8588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(294.9731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(657.4146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1006.6101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(848.4696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(710.8993, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(786.9740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(817.6433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(708.9015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(643.6489, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(534.6547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1008.5016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(846.9121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(627.3690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(474.2369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1036.5730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(414.2339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(656.2686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(918.1740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(410.0781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(567.0303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(603.1940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.2031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(593.3500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(914.8827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.4657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(503.5221, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(451.3226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(579.7776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1188.9391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(578.3098, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(843.6547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(691.6976, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1148.5371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(424.5996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1104.7482, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(823.6038, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(512.1652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(743.3547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.9851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.6667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.2531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(634.2710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1083.1427, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(342.5691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.2035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(672.3615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(497.8902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1152.7772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1473.4167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1343.9824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(392.3837, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(904.5757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1069.9187, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.38602941176470584
Sentence level Krippendorff's alpha for Premises:  0.48897058823529416
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 302
	Prediction sentences having premises: 184
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 51
	Prediction Sentence having neither claim nor premise: 109
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 377
	Sentences having claim in only one of reference or prediction: 167
	Sentences having premise in both reference and prediction: 405
	Sentences having premise in only one of reference or prediction: 139
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(1266.2754, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(874.1203, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(593.5074, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.3548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.1811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(442.9805, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(714.8444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(700.0032, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(601.3018, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(609.7483, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.8750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(475.4430, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(589.5620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(441.6243, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(822.1927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(736.2692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(592.4127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(508.6148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1043.6682, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(400.9695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(723.4990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1019.0529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(392.7071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(546.9345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(632.4989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.7763, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(578.5867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(809.7031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(250.2545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(411.5681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.5304, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(537.3613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(979.9606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(459.6520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(779.2721, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(650.9898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(924.6948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(291.0962, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(788.7858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(707.7036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(390.1157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(547.4744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.8246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.0068, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.4704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(492.8010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(871.5804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(239.6487, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.4076, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(464.0688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(502.5286, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1343.7562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1701.3604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1471.5582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(337.4891, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(704.8987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1511.3304, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4191176470588235
Sentence level Krippendorff's alpha for Premises:  0.4779411764705882
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 227
	Prediction sentences having premises: 315
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 68
	Prediction Sentence having neither claim nor premise: 70
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 386
	Sentences having claim in only one of reference or prediction: 158
	Sentences having premise in both reference and prediction: 402
	Sentences having premise in only one of reference or prediction: 142
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(2205.2898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1182.9973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1050.6379, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.1565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.9545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(359.4514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(638.2485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(807.4047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(626.9218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(815.2052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(625.9796, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(450.0376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(689.2385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.9451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(786.9949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(768.3852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(438.8499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.1398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(935.3713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.1791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(703.4418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(798.0997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.3349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(416.6802, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(620.9227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(386.4262, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(545.6403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(780.5958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(296.5248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(474.0358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.5533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.8297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1126.8910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(770.2460, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1472.5371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1027.6271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1142.5664, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(365.3954, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1262.8932, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(929.9982, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(328.9407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(663.2872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(287.7990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.4272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.8628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(497.5583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(753.4807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.0670, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.5806, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(467.3080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(323.0651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(874.2057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1105.7173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1058.2019, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.5574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(514.9474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(996.1009, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.42647058823529416
Sentence level Krippendorff's alpha for Premises:  0.4007352941176471
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 185
	Prediction sentences having premises: 362
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 67
	Prediction Sentence having neither claim nor premise: 64
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 388
	Sentences having claim in only one of reference or prediction: 156
	Sentences having premise in both reference and prediction: 381
	Sentences having premise in only one of reference or prediction: 163
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(1254.4960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(878.0765, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(703.4934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.5449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.0561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.2868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(659.7280, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(779.7363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(610.7424, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(863.4471, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(636.2067, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.3452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(549.2607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(402.2569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(749.0389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(596.2899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(479.4758, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.5090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(735.8023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(336.2180, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(825.0870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1150.2574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.9952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(434.7872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(522.3896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(387.2118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(618.7563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(903.9653, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.7571, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(527.3120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(370.8756, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(587.2731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1058.2606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(601.5435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(840.4860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(668.9515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(889.8917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.0821, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(697.5070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(617.5016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.5399, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(433.5806, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.2978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.7743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.9058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(403.6447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(627.8228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.0585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(306.0598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(542.4540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.5060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(791.7318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1354.7576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1187.8384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(419.7167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(705.9127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1048.2410, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3786764705882353
Sentence level Krippendorff's alpha for Premises:  0.3933823529411765
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 66
	Prediction sentences having premises: 388
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 22
	Prediction Sentence having neither claim nor premise: 112
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 375
	Sentences having claim in only one of reference or prediction: 169
	Sentences having premise in both reference and prediction: 379
	Sentences having premise in only one of reference or prediction: 165
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(1452.4978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1000.9700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(816.3500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(422.4696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.2214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.9092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(547.8108, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(694.9817, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(391.0051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(430.0027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(374.7547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(317.6526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(306.9691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(257.3165, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(517.5642, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(543.6306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.6985, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.0156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(432.6311, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.9160, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(366.1414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.8406, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.0269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.6215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(496.0482, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.4905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.5191, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(749.3942, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.9735, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(527.6227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(281.9380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(528.5648, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1205.6697, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(786.1719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1539.3834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(959.9993, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1540.3210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(420.5594, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1118.9351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(720.1125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(414.5662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(494.9714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.7435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.1670, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.2303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(295.1141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.0447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.1782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.2717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.3391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.1470, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(547.1107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(711.4994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(745.5156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.2942, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.6808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(690.4058, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3897058823529411
Sentence level Krippendorff's alpha for Premises:  0.3492647058823529
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 63
	Prediction sentences having premises: 424
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 29
	Prediction Sentence having neither claim nor premise: 86
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 378
	Sentences having claim in only one of reference or prediction: 166
	Sentences having premise in both reference and prediction: 367
	Sentences having premise in only one of reference or prediction: 177
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(1177.3267, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(822.4259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(550.6996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(279.5161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.5531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(547.4632, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(654.8016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(762.5748, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(547.7419, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(817.3011, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(931.1860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(830.5613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(614.3424, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(719.5277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1588.7141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1338.3843, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(780.5131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.1500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(798.4297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.4663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(549.2725, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(826.1090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(226.8400, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(410.8899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(356.0253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.2564, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(471.8924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(487.0223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.2626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(276.9036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.8174, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.5701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(590.4070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(415.8851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(615.0226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(529.8569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(919.8654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.6070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(679.4928, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(596.0390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(666.9780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(606.0712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(269.1542, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.7628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(355.9507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(397.6582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(688.6427, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(376.2772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.4883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(714.6009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(558.2681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(711.3109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1142.2592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1384.6099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(595.2298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(721.6654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(969.2687, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.43014705882352944
Sentence level Krippendorff's alpha for Premises:  0.4558823529411765
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 284
	Prediction sentences having premises: 229
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 71
	Prediction Sentence having neither claim nor premise: 102
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 389
	Sentences having claim in only one of reference or prediction: 155
	Sentences having premise in both reference and prediction: 396
	Sentences having premise in only one of reference or prediction: 148
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(951.2899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(702.6602, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(406.8783, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.7902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.2170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.3250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(359.8875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(527.9084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.9407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(381.3207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.4649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.1601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(266.9761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.6621, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(457.9063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(485.1542, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(385.7410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.1316, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(438.8927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.5500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(393.1832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(571.3925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.5059, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(323.7626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.3518, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.5046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(515.3225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(462.3105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.8093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.6674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.9882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(413.3096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(648.8199, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.8052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(523.7881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.0876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(673.5756, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.7233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(606.5276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(627.3638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.2116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(356.7640, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.9322, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.6769, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.9665, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(229.0621, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(348.2069, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.6512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.1862, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.3104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.2090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(422.3367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(537.0228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(636.4513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.3896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.1075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(531.4601, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4632352941176471
Sentence level Krippendorff's alpha for Premises:  0.5257352941176471
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 259
	Prediction sentences having premises: 280
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 82
	Prediction Sentence having neither claim nor premise: 87
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 398
	Sentences having claim in only one of reference or prediction: 146
	Sentences having premise in both reference and prediction: 415
	Sentences having premise in only one of reference or prediction: 129
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(617.9581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(410.3751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.3938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.9228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.3319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.3121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.5842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(387.3087, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(295.9913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(279.4862, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.8738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.3608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.2118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.6178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(272.9879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.7246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(237.4885, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.6208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(249.2868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.1698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.8898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(323.5539, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.9991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.1573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.0553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.5397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.1226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(286.1765, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.9176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.3407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.6924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.1242, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(454.0975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.3382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.0658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.9143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(440.8896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.1233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.6765, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(420.7729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(158.9392, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.3023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.4642, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.5839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.5351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.4398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.4811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.5979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.2710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.0523, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.0460, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(347.7875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(398.4200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(493.2570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.0695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.6527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(418.1171, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4595588235294118
Sentence level Krippendorff's alpha for Premises:  0.5257352941176471
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 222
	Prediction sentences having premises: 284
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 60
	Prediction Sentence having neither claim nor premise: 98
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 397
	Sentences having claim in only one of reference or prediction: 147
	Sentences having premise in both reference and prediction: 415
	Sentences having premise in only one of reference or prediction: 129
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(540.8420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.5520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.9637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.2041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.6794, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.2090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.2652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.0581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.3430, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.0856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.7744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.3219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.6554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.0940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.7840, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.1048, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.2333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.1993, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.5269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.1439, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.8519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(272.4805, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.3717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.5172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.0871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.7484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(210.5933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.8184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.7648, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.8260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.4973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.5683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(317.3108, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.1827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(202.1176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.9422, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.2737, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.0683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.5436, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(240.2037, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.7166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.2262, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.0667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.2892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.7157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.5744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.5653, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.7971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.2831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.5841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.1128, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(287.6957, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(306.7153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.1442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.7463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.2117, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(360.3283, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4558823529411765
Sentence level Krippendorff's alpha for Premises:  0.5294117647058824
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 233
	Prediction sentences having premises: 281
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 64
	Prediction Sentence having neither claim nor premise: 94
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 396
	Sentences having claim in only one of reference or prediction: 148
	Sentences having premise in both reference and prediction: 416
	Sentences having premise in only one of reference or prediction: 128
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(462.8505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.5243, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.7172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.1266, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.7738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.2529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.1310, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.6875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.9391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.4925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.4112, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.0817, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.9861, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.8663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.7248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.7513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.3637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.4110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.0277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.9598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.9945, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.1660, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.5449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.4799, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.2902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.4109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.5933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.0774, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.4541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.1936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.7507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.2955, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.5750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.8268, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.8912, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(206.8495, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.7244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.7621, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.0050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.5695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.1952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.0806, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.8578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.5642, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.4184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.9541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.6927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.7897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.6421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.6713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.8428, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.0968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.2890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.0452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.2744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.7545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(312.6011, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4595588235294118
Sentence level Krippendorff's alpha for Premises:  0.5367647058823529
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 230
	Prediction sentences having premises: 283
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 63
	Prediction Sentence having neither claim nor premise: 94
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 397
	Sentences having claim in only one of reference or prediction: 147
	Sentences having premise in both reference and prediction: 418
	Sentences having premise in only one of reference or prediction: 126
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(420.1526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.3425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.5460, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.1258, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.5651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.9846, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.5978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.4513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.5704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.7655, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.3869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.3403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.2565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.5746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.0010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.8438, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.1736, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.4713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.6994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.9589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.1812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.8464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.4896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.9364, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.3484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.6396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.7034, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.6566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.2630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.2472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.8396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.5666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(202.3156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.8157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.0418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.8540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(272.8183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.4892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.8084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.1990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.7381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.8660, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.1055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.5998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.3219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.1421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.2192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.3526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.6243, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.3824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.6007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.6434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.2594, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.0821, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.5883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.1166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.4132, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4595588235294118
Sentence level Krippendorff's alpha for Premises:  0.5441176470588236
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 226
	Prediction sentences having premises: 289
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 62
	Prediction Sentence having neither claim nor premise: 91
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 397
	Sentences having claim in only one of reference or prediction: 147
	Sentences having premise in both reference and prediction: 420
	Sentences having premise in only one of reference or prediction: 124
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(403.5207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.6393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.0727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.5212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.0947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.9399, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.4176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.0650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.5846, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.5444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.1686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.8317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.3939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.3069, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.8336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.3200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.8791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.5134, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.7117, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.9352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.3874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(158.8606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.4063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.0520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.2684, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.3418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.5690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.7892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.9958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.9159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.9854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.4816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.3918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.9350, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.8803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.8212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.9398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.3662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.8040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(164.7248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.5316, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.3984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.4801, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.5014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.1655, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.8128, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.5133, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.5049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.8759, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.9299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.4351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.3248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.9272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.7843, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.1519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.9105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.3379, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.44852941176470584
Sentence level Krippendorff's alpha for Premises:  0.5367647058823529
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 237
	Prediction sentences having premises: 273
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 63
	Prediction Sentence having neither claim nor premise: 97
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 394
	Sentences having claim in only one of reference or prediction: 150
	Sentences having premise in both reference and prediction: 418
	Sentences having premise in only one of reference or prediction: 126
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(410.8398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.8340, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.0908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.9873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.1409, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.5578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.1698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.9389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.0682, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.0916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.5279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.5622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.7455, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.7163, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.3931, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.8653, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.6846, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.3629, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.5690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.9757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.1516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.6257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.0148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.5007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.1201, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.6664, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.3409, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.7540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.6814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.3838, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.1428, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.4645, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.5091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.0764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.5856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.0812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(237.5581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.0627, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.0042, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.5074, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.2868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.9026, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.5367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.9658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.8073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.4096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.1729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.9774, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.2359, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.9828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.4167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.6513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.2887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.4826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.2622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.0644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.4106, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4595588235294118
Sentence level Krippendorff's alpha for Premises:  0.5514705882352942
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 210
	Prediction sentences having premises: 291
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 54
	Prediction Sentence having neither claim nor premise: 97
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 397
	Sentences having claim in only one of reference or prediction: 147
	Sentences having premise in both reference and prediction: 422
	Sentences having premise in only one of reference or prediction: 122
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(345.4468, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.7293, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.1209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.0869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.2434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.1134, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.7516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.6732, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.5061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.7969, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.1311, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.5870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.6933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.4286, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.9086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.7245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.9995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.3769, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.9150, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.5972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.6797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.0190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.1090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.1728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.7643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.8305, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.0407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.8958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.5466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.0281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.4549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.4214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.5295, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.7983, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.8003, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.7507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.8678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.0265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.5814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.8968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.1563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.0078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.6803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.8921, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.3440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.4498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.7746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.2741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.4326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.8940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.3380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(158.8507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.5543, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.4019, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.4935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.3784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.3002, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4595588235294118
Sentence level Krippendorff's alpha for Premises:  0.5367647058823529
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 242
	Prediction sentences having premises: 281
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 69
	Prediction Sentence having neither claim nor premise: 90
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 397
	Sentences having claim in only one of reference or prediction: 147
	Sentences having premise in both reference and prediction: 418
	Sentences having premise in only one of reference or prediction: 126
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(315.9788, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.0170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.7827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.7454, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.6237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.4793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.0548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.5732, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.8799, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.7554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.7036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.3501, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.1510, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.1275, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.5171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.2224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.0905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.0943, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.4146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.1366, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.3554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.6539, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.6159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.5402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.2725, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.0819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.9420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.1973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.6110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.0936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.3738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.5133, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.7215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.4601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.9922, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.7029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.8766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.6616, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.5801, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.3730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.7433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.1173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.4299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.2838, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.7430, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.8404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.5910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.3403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.4353, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.4429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.2691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.1890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.1500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.2849, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.4549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.2652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.6805, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.46691176470588236
Sentence level Krippendorff's alpha for Premises:  0.5441176470588236
Additional attributes: 
	Total Sentences: 544
	Prediction sentences having claims: 226
	Prediction sentences having premises: 289
	Reference sentences having claims: 211
	Reference sentences having premises: 255


	Prediction Sentence having both claim and premise: 63
	Prediction Sentence having neither claim nor premise: 92
	Reference Sentence having both claim and premise: 60
	Reference Sentence having neither claim nor premise: 138


	Sentences having claim in both reference and prediction: 399
	Sentences having claim in only one of reference or prediction: 145
	Sentences having premise in both reference and prediction: 420
	Sentences having premise in only one of reference or prediction: 124
				 Metric computations: None


		-------------RUN 3-----------
			------------EPOCH 1---------------
Loss:  tensor(2800.9707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2118.8706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1451.7938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1747.2200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2706.3037, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2411.2793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1595.1091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1819.8997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1426.6128, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1392.4364, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1802.9246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3917.6538, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2364.0320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2354.3833, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2047.2009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1664.1064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1686.3914, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1245.3967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1327.7758, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1553.5533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1982.7732, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(862.4204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1828.4038, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1647.8940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2509.1907, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1150.5493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1658.3827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1940.3926, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1640.6942, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2429.6326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2128.2612, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2186.5964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2292.3674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2060.0200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1774.8497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1079.2689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1957.7617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2008.4448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1474.8644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1674.7660, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2354.2124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2017.9058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1539.3973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1101.3557, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1584.8660, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1730.6489, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1268.4673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1973.4203, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1026.1592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(695.5293, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(787.5585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(572.5090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1076.9409, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(790.9345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2308.9727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(903.4792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1826.8862, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3415007656967841
Sentence level Krippendorff's alpha for Premises:  0.32924961715160794
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 142
	Prediction sentences having premises: 374
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 72
	Prediction Sentence having neither claim nor premise: 209
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences having claim in both reference and prediction: 438
	Sentences having claim in only one of reference or prediction: 215
	Sentences having premise in both reference and prediction: 434
	Sentences having premise in only one of reference or prediction: 219
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(1965.9642, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1458.5292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(980.0388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1065.6682, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2207.4517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1710.1764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1238.6715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1471.2668, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1234.8658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1033.1500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1387.1256, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3463.2090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2059.6191, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1908.3466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1661.4646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1464.7947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1477.6924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1027.7939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1051.3159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1144.4546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1633.9259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(702.1105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1624.8521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1468.0776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1862.5505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(846.0382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1171.2463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1437.1025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1373.7188, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2119.7170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1853.3978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1910.6287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1934.2786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1640.4446, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1502.3447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(932.2960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1430.9221, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1581.9274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1208.9512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1381.7279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2013.9512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1709.6729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1199.9072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(868.3824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1197.0320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1381.8241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(951.5664, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1598.5800, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(857.1808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(567.4894, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(648.8577, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(485.1838, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(922.6853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(669.2815, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1982.0115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(739.1945, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1515.0901, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3996937212863706
Sentence level Krippendorff's alpha for Premises:  0.3200612557427259
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 125
	Prediction sentences having premises: 375
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 48
	Prediction Sentence having neither claim nor premise: 201
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences having claim in both reference and prediction: 457
	Sentences having claim in only one of reference or prediction: 196
	Sentences having premise in both reference and prediction: 431
	Sentences having premise in only one of reference or prediction: 222
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(1682.7511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1274.9014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(804.4096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(838.2031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1883.2988, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1364.4869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(925.5125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1184.7618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1049.8728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(837.3639, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1131.6493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2915.9087, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1672.4114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1673.5741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1423.6606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1288.1941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1285.9691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(885.5256, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(908.2589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(943.8591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1464.2617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(592.4525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1375.9668, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1239.4904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1416.7393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(615.2839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(738.6019, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1093.0271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1105.8042, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1702.6406, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1540.7910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1704.9155, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1652.4197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1354.0940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1249.9335, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(811.8519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1119.9753, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1368.2025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1071.3489, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1107.8643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1737.1228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1480.3207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(938.9598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(742.4219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(958.9609, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1046.3025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(672.9974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1205.0148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(655.4868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(432.7261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(492.4123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(398.3699, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(777.3618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(555.8136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1693.4321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(584.9426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1215.8171, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.46401225114854516
Sentence level Krippendorff's alpha for Premises:  0.38744257274119454
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 154
	Prediction sentences having premises: 359
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 45
	Prediction Sentence having neither claim nor premise: 185
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences having claim in both reference and prediction: 478
	Sentences having claim in only one of reference or prediction: 175
	Sentences having premise in both reference and prediction: 453
	Sentences having premise in only one of reference or prediction: 200
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(1381.9307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1079.9969, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(682.8589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(665.6134, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1442.1593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1087.8049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(668.2023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(906.1660, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(920.6462, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(721.9026, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(980.2328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2440.3435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1268.7532, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1434.6736, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1140.4343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1064.3843, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1080.1135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(720.1732, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(801.0210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(773.0042, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1270.0541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(461.0924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1104.5996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1045.5831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1052.2439, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(453.2934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.8307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(840.7184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(880.3193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1423.6929, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1276.4731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1481.9922, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1399.9753, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1041.5020, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(983.0624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(708.3142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(880.2258, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1068.0076, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(899.0759, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(904.4243, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1456.0928, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1241.7844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(756.8390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(566.5474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(740.4050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(841.2496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.5675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(855.2300, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(471.7807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(304.9490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(362.1904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(320.3389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(683.6438, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(474.0525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1493.7754, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(455.2308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(926.2690, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4578866768759571
Sentence level Krippendorff's alpha for Premises:  0.3721286370597243
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 114
	Prediction sentences having premises: 410
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 41
	Prediction Sentence having neither claim nor premise: 170
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences having claim in both reference and prediction: 476
	Sentences having claim in only one of reference or prediction: 177
	Sentences having premise in both reference and prediction: 448
	Sentences having premise in only one of reference or prediction: 205
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(1086.6362, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(852.8383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(610.7488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(559.6244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1115.5063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(862.7272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(497.9103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(713.5607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(713.1876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(613.3252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(845.7635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2057.1323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(947.8691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1162.4971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(943.1138, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(828.6484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(859.0068, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(548.4399, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(669.7468, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(611.4515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1146.0367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(359.5507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(925.3170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(924.7945, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(804.9738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(335.1137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.2578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(650.2440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(668.7440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1042.7452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(958.6432, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1258.9910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1235.3804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(868.8336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(799.7700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(589.6301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(674.7278, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(847.7346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(750.5991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(680.0395, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1172.7522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1072.6040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(547.2326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(393.5715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(497.6122, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(637.8417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.7373, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(607.7521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.4262, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.0614, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.0272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(285.1696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(590.0131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(353.2433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1358.7070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.9736, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(734.8656, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.43338437978560485
Sentence level Krippendorff's alpha for Premises:  0.36600306278713635
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 80
	Prediction sentences having premises: 406
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 27
	Prediction Sentence having neither claim nor premise: 194
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences having claim in both reference and prediction: 468
	Sentences having claim in only one of reference or prediction: 185
	Sentences having premise in both reference and prediction: 446
	Sentences having premise in only one of reference or prediction: 207
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(1107.6005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(761.1588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(482.5925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(436.1040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(918.0351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(754.4106, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(402.6669, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(665.7120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(579.9643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(532.0463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(899.8316, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2025.2473, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(707.0337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1089.9707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(769.4709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(579.3760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(764.9493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(456.9672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(542.9470, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(462.4550, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1058.9916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(359.5490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(758.1959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(793.8390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(723.5118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(280.0880, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.0197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(551.7875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(606.1606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1088.1560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(935.9476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1319.9259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1047.0873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(950.2946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(804.6676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(552.4310, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(637.7413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(744.5530, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(693.3091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(498.1766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1336.4858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1175.6685, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(458.7801, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(425.8672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(486.3411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(629.5766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.1544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(667.0248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(274.6694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.2882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.2030, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.5829, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(546.6868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(354.7990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1180.9219, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.6859, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(657.9758, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3721286370597243
Sentence level Krippendorff's alpha for Premises:  0.40275650842266464
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 50
	Prediction sentences having premises: 300
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 17
	Prediction Sentence having neither claim nor premise: 320
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences having claim in both reference and prediction: 448
	Sentences having claim in only one of reference or prediction: 205
	Sentences having premise in both reference and prediction: 458
	Sentences having premise in only one of reference or prediction: 195
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(1352.2688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(982.9505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(602.6464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(630.4558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1133.0681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(891.5308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(571.2649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(727.7421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(441.8406, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(513.5839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(703.1447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2107.1074, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(768.6593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1043.1476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(864.6584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(524.1155, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(777.8489, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(555.3873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(670.6541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(794.1655, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1191.4656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(271.5871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(669.0897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(765.1913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(803.6014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(277.6791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.8723, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(575.6433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(585.4425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1054.2905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(938.5341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1149.1063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1211.0485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1209.9478, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(993.3818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(672.3420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1268.9857, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1136.8650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1160.9851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(789.9535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2026.7211, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1827.5674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1047.8385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(641.6007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(605.5121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(820.9439, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(513.5435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1046.7278, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.1557, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.8292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(237.3589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.3663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.4200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.9586, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1068.6396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.2138, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(491.7199, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.48545176110260335
Sentence level Krippendorff's alpha for Premises:  0.4211332312404288
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 119
	Prediction sentences having premises: 338
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 43
	Prediction Sentence having neither claim nor premise: 239
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences having claim in both reference and prediction: 485
	Sentences having claim in only one of reference or prediction: 168
	Sentences having premise in both reference and prediction: 464
	Sentences having premise in only one of reference or prediction: 189
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(749.8129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(565.3711, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(395.1187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(335.5545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(820.8047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(628.9913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(370.2717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(501.5687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(481.6183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(566.5918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(771.3115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2142.0537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(957.3638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1191.6086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(903.5429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(651.5704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(748.5486, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(591.3003, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(856.5272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1002.1476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1657.9225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(349.4348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(791.6172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(927.5988, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1015.5017, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(447.9832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(286.1017, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(786.1967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(708.0004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(799.5768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1071.4900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1803.6073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1316.8420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1025.4119, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(778.2661, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(444.7745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(565.1227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(711.8913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(601.6964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(434.1761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1131.9353, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1006.0544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(451.6394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(348.0156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(433.4674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(737.2330, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(462.9447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(745.9371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.6501, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.6596, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(347.7803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(360.4845, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(733.0932, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(511.2703, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1819.3179, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(497.2209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(615.6833, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.19142419601837668
Sentence level Krippendorff's alpha for Premises:  0.3568147013782542
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 481
	Prediction sentences having premises: 105
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 42
	Prediction Sentence having neither claim nor premise: 109
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences having claim in both reference and prediction: 389
	Sentences having claim in only one of reference or prediction: 264
	Sentences having premise in both reference and prediction: 443
	Sentences having premise in only one of reference or prediction: 210
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(1190.4504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(691.5583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(479.1915, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(448.9033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(866.2671, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(714.4505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(281.5874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(611.0903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(558.1443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.1305, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(523.3831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1402.9591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(588.8187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(817.9018, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(685.9362, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(533.4006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(484.2957, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(276.4490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.5557, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(385.3652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(926.0210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.1982, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(421.7930, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(533.0118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(611.9550, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(277.0951, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.4772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(476.9030, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(500.9542, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(618.5823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(739.2844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(986.2921, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(772.5078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(726.4364, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(647.5063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.2358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.1178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(664.0319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(596.8677, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(495.0641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(824.7711, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(678.3393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.9777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(332.7029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.1031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(404.1361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.0013, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(368.2534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.4882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.5184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.7758, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.7656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.2459, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.4455, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(792.1334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.3583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.7397, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.47626339969372133
Sentence level Krippendorff's alpha for Premises:  0.4211332312404288
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 264
	Prediction sentences having premises: 188
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 53
	Prediction Sentence having neither claim nor premise: 254
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences having claim in both reference and prediction: 482
	Sentences having claim in only one of reference or prediction: 171
	Sentences having premise in both reference and prediction: 464
	Sentences having premise in only one of reference or prediction: 189
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(571.5641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(407.1601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(249.4843, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.8836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(497.9488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(400.3018, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.6374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(357.0604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.0480, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(298.9298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(566.9468, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1379.7673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(500.9496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.4598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(717.7358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.5023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(333.7827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(222.5211, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.6545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.5846, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(648.2141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.6127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.0589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.4460, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(445.0963, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.4303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.2886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(282.9726, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(332.2804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(432.1948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(428.5133, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(571.7321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.7095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(602.3708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(429.0462, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(356.9448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(447.7090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(484.7882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(553.4012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.5784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(979.4159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(905.8699, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.3787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(271.1976, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.6048, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.6877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.6055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(466.6967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.4803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.6747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.7952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.3191, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.6700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.8570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(706.9628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.5116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.3022, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.46401225114854516
Sentence level Krippendorff's alpha for Premises:  0.5834609494640122
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 310
	Prediction sentences having premises: 297
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 83
	Prediction Sentence having neither claim nor premise: 129
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences having claim in both reference and prediction: 478
	Sentences having claim in only one of reference or prediction: 175
	Sentences having premise in both reference and prediction: 517
	Sentences having premise in only one of reference or prediction: 136
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(492.0135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.2250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.1548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(166.5795, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.6921, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(266.1094, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.5941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.4444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.5650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.3667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.5686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(747.2173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.3289, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(660.2841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(570.3218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(351.5884, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.4526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.9518, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(272.0399, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(274.0276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(700.2888, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.4756, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(317.5323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(511.3048, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.5509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.6869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.9593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(298.5117, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.0407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(249.9700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.2106, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(481.3566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(438.1722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.9538, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(312.7344, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.8724, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.2973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(446.2693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.0617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.5521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(582.7982, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(508.7371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.4648, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.9758, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.2111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(335.8916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(184.7037, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(398.9690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.9873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.7881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.8372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.9372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.2007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.0164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(804.2577, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.1272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.3027, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4732006125574273
Sentence level Krippendorff's alpha for Premises:  0.43950995405819293
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 231
	Prediction sentences having premises: 406
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 81
	Prediction Sentence having neither claim nor premise: 97
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences having claim in both reference and prediction: 481
	Sentences having claim in only one of reference or prediction: 172
	Sentences having premise in both reference and prediction: 470
	Sentences having premise in only one of reference or prediction: 183
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(640.9346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(576.9829, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.8630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.7423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(535.2210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(311.6749, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(207.0256, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.4000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(332.9867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.1997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(311.6118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1301.8804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.1924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(720.3508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(464.4575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.4800, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.7543, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.3186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.7646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.3300, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(459.1920, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.3384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.9297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(294.1338, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(295.1353, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.0762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.3604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.3702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.6255, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.7704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.8586, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.3931, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.7004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(430.4354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.6456, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.6657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.6356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(420.3652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(359.6658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(207.8915, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(581.0454, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(533.2971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.6852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.7166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.4043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(262.3166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(166.1893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(281.7513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.4913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.4268, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.5258, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.4480, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.2367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.2899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(593.3945, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.0055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.8057, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4732006125574273
Sentence level Krippendorff's alpha for Premises:  0.5620214395099541
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 293
	Prediction sentences having premises: 274
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 78
	Prediction Sentence having neither claim nor premise: 164
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences having claim in both reference and prediction: 481
	Sentences having claim in only one of reference or prediction: 172
	Sentences having premise in both reference and prediction: 510
	Sentences having premise in only one of reference or prediction: 143
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(308.1643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.7348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.4656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.2825, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.3816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.8683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.1080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.5306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.4760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.0397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.1964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(821.1522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.2101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(441.1252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.8103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.3776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.0661, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.2202, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.1109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.8356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(494.5529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.8676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(158.9540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(314.7844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(321.2007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.4367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.3568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.5018, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.6645, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.4093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.1346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.2090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(329.6937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(354.9431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.3271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.5792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.4578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(466.8238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(317.0002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.4977, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(422.3131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(364.8110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.2598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.7554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.2171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.6498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.2414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.6567, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.3488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.6961, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.2767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.9584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.9099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.3686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.9628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.6248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.1391, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4946401225114855
Sentence level Krippendorff's alpha for Premises:  0.555895865237366
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 282
	Prediction sentences having premises: 308
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 76
	Prediction Sentence having neither claim nor premise: 139
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences having claim in both reference and prediction: 488
	Sentences having claim in only one of reference or prediction: 165
	Sentences having premise in both reference and prediction: 508
	Sentences having premise in only one of reference or prediction: 145
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(241.5449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.4864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.2964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.9114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.6734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.4706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.7074, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.7954, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.6155, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.3601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.9971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(595.1337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(202.0501, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(356.3173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.3048, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.9902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.5164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.1845, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.6680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.8917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(304.1993, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.4373, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.7828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.0640, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.1662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.1351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.7300, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.3047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.0766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.5084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.6662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.2215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.8003, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(277.8845, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.1449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.1972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.8696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(272.2486, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.7482, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.3185, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.5630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.7356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.8218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.1323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.6067, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.7071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.6586, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.8027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.2253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.2302, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.9340, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.4776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.2099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.9980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(281.3624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.3193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.3268, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5222052067381318
Sentence level Krippendorff's alpha for Premises:  0.5283307810107197
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 233
	Prediction sentences having premises: 341
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 62
	Prediction Sentence having neither claim nor premise: 141
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences having claim in both reference and prediction: 497
	Sentences having claim in only one of reference or prediction: 156
	Sentences having premise in both reference and prediction: 499
	Sentences having premise in only one of reference or prediction: 154
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(179.7384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.5921, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.5341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.2820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.2932, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.3455, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.3116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.8223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.2365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.8060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.2251, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(627.1953, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.1977, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(308.7212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.9714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.7143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.5129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.0148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.4895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.6011, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.1125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.7929, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.4612, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(164.1217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.3256, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.9516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.5868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.3864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.7326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.6411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.4593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.9003, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.0546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.6371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.3078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.0080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.2439, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(206.3778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.9155, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.3571, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.6619, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(266.4382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.9237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.7586, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.5382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.7313, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.9506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.4197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.4182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.5496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.6499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.7186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.0710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.5183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(226.0667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.6063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.1903, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4823889739663093
Sentence level Krippendorff's alpha for Premises:  0.5467075038284839
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 294
	Prediction sentences having premises: 295
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 81
	Prediction Sentence having neither claim nor premise: 145
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences having claim in both reference and prediction: 484
	Sentences having claim in only one of reference or prediction: 169
	Sentences having premise in both reference and prediction: 505
	Sentences having premise in only one of reference or prediction: 148
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(166.3920, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.8187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.3779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.1081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.6809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.3047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.3802, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.6976, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.8323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.1593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.5297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(610.8680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.4624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.3863, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.7585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.1490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.9524, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.4259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.0168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.5746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(206.1354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.3852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.7071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.1877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.3674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.8289, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.1878, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.9604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.7834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.2193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.7929, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.5271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.9508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(222.8883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.3154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.9546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.9898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.6555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.8032, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.8268, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(272.0780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.9132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.2231, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.0474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.8865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.8877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.9515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.0446, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.5804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.3471, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.3041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.8577, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.4390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.8069, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.6191, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.1589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.7013, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4977029096477795
Sentence level Krippendorff's alpha for Premises:  0.5283307810107197
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 237
	Prediction sentences having premises: 339
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 67
	Prediction Sentence having neither claim nor premise: 144
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences where reference and prediction agree on claim: 489
	Sentences having claim in only one of reference or prediction: 164
	Sentences where reference and prediction agree on premise: 499
	Sentences having premise in only one of reference or prediction: 154
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(225.0208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.6159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.4860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.9958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.6923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.7421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.8266, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.7584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.3811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.9184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.9540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(444.0544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.5051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.2839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.9818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.1734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.0830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.2597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.5551, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.0466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.7141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.9755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.2329, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.7472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.8302, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.3658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.3035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.5642, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.6464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.4960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.1450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.0652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.0654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.6836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.3062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.0698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.9222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.2504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.3541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.4323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.0931, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.6570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.8592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.6055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.2429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.3486, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.8260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.8765, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.9719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.9648, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.9152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.3810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.1923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.7905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.3573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.9546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.4217, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.48545176110260335
Sentence level Krippendorff's alpha for Premises:  0.5130168453292496
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 237
	Prediction sentences having premises: 346
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 68
	Prediction Sentence having neither claim nor premise: 138
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences where reference and prediction agree on claim: 485
	Sentences having claim in only one of reference or prediction: 168
	Sentences where reference and prediction agree on premise: 494
	Sentences having premise in only one of reference or prediction: 159
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(203.9822, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.4014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.9871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.4061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.1184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.8349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.1589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.7933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.1088, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.5999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.8268, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(464.0570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.3319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.0281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.0326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.6957, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.8917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.4279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.9577, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.4382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.5427, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.1375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.4448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.8698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.1364, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.5290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.8863, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.8101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.6840, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.2393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.9489, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.3066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.0280, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.0891, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.9608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.2788, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.0723, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.4182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.5437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.9727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.4533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.5705, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.5361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.5124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.1295, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.8535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.6516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.7545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.5748, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.6930, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.7409, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.6701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.6339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.6066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.2203, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.0919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.6207, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.45482388973966315
Sentence level Krippendorff's alpha for Premises:  0.5313935681470138
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 285
	Prediction sentences having premises: 306
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 77
	Prediction Sentence having neither claim nor premise: 139
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences where reference and prediction agree on claim: 475
	Sentences having claim in only one of reference or prediction: 178
	Sentences where reference and prediction agree on premise: 500
	Sentences having premise in only one of reference or prediction: 153
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(136.3605, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.8646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.4598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.1526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.8662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.4964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.9195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.3999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.3066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.4273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.1586, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(439.3114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.6443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.1057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.1204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.5930, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.1188, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.7701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.0507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.8970, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.2157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.7583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.6440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.8739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.1378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.9477, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(6.5217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.2893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.6792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.9630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.3494, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.9627, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.8796, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.1754, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.5728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.6075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.1062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.3824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.6599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.8548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(207.8928, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.0900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.5171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.8403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.0938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.0675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.6174, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.5028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.5023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(4.8005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.2970, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.9534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.5431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.7732, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.9960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.8858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.7729, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5099540581929556
Sentence level Krippendorff's alpha for Premises:  0.5038284839203675
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 225
	Prediction sentences having premises: 359
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 69
	Prediction Sentence having neither claim nor premise: 138
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences where reference and prediction agree on claim: 493
	Sentences having claim in only one of reference or prediction: 160
	Sentences where reference and prediction agree on premise: 491
	Sentences having premise in only one of reference or prediction: 162
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(110.7262, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.6429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.1511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.6504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.7744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.3379, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.7202, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.8553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.6313, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.4348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.5129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(349.4440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.8746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.1469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.9347, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.9956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.2939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.3693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.3011, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.7337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.2782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(5.9326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.0997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.4463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.2460, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.4450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(4.7405, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.5485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.7103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.2338, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.9204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.7681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(146.0596, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.0853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.5358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.3321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.0057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.6416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.5023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.0816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.2285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.1212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.6824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.7880, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.5232, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.0391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(6.0397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.3290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.2288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.9065, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(5.3819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.0407, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.5834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.9139, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.8793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.4423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.4480, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5068912710566615
Sentence level Krippendorff's alpha for Premises:  0.5344563552833078
Additional attributes: 
	Total Sentences: 653
	Prediction sentences having claims: 258
	Prediction sentences having premises: 337
	Reference sentences having claims: 243
	Reference sentences having premises: 275


	Prediction Sentence having both claim and premise: 77
	Prediction Sentence having neither claim nor premise: 135
	Reference Sentence having both claim and premise: 63
	Reference Sentence having neither claim nor premise: 198


	Sentences where reference and prediction agree on claim: 492
	Sentences having claim in only one of reference or prediction: 161
	Sentences where reference and prediction agree on premise: 501
	Sentences having premise in only one of reference or prediction: 152
				 Metric computations: None


		-------------RUN 4-----------
			------------EPOCH 1---------------
Loss:  tensor(2241.6638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2236.1321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2048.7700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1166.8745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1140.7869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(871.4224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1754.7727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1605.7499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1248.3373, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1314.0200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1557.4094, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2208.3042, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2036.5360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2377.3555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1992.2375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1981.4331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1348.7673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2359.6121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1298.6814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1460.9062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2449.8352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1170.3939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2253.1035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2045.1226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(843.9629, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2084.8928, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2492.8486, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2304.2502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1496.7959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(980.7849, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1526.0424, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1834.8033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1817.5122, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1347.3142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1771.0298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(858.4473, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1678.3555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2466.1177, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1113.9570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(888.6925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(724.3253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1729.6082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1914.2710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(842.3464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1023.0893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1424.1395, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(858.8430, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1575.7036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(770.6004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1547.1711, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1000.0446, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1826.9786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1098.0629, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1420.5940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1366.6439, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(847.5110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1273.4202, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.2228116710875332
Sentence level Krippendorff's alpha for Premises:  0.3687002652519894
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 193
	Prediction sentences having premises: 375
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 17
	Prediction Sentence having neither claim nor premise: 203
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences where reference and prediction agree on claim: 461
	Sentences having claim in only one of reference or prediction: 293
	Sentences where reference and prediction agree on premise: 516
	Sentences having premise in only one of reference or prediction: 238
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(1306.1447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1434.2563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1434.9238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(737.6086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(843.3135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(614.9890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1429.5823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1221.1305, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(958.1393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1013.1145, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1227.6965, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2053.6411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1656.8290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1989.9176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1790.0414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1559.9412, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1050.3782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1832.6869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(978.1249, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1087.6807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2028.3091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(940.3731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1903.6179, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1565.8447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(682.7916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1827.6104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2065.1860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1920.2114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1289.6462, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(847.3656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1318.6820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1602.2212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1628.3315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1000.8650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1391.5953, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(724.9427, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1396.8877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2062.9780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(927.1714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(715.9009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(619.2347, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1448.6875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1639.5449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(707.8904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(872.1259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1197.1824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(729.7711, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1374.0392, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(605.6428, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1298.9781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(818.4303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1539.6431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(860.0787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1076.0365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1072.1865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(699.8976, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1072.9692, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.22015915119363394
Sentence level Krippendorff's alpha for Premises:  0.3952254641909815
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 358
	Prediction sentences having premises: 267
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 34
	Prediction Sentence having neither claim nor premise: 163
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences where reference and prediction agree on claim: 460
	Sentences having claim in only one of reference or prediction: 294
	Sentences where reference and prediction agree on premise: 526
	Sentences having premise in only one of reference or prediction: 228
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(1141.0282, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1275.4524, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1250.7119, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(649.0754, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(725.8326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(499.3309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1127.4988, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1083.3741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(823.4938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(871.8015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1016.9327, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1611.1274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1395.0198, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1611.4750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1458.4114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1254.1055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(880.9392, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1532.9170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(833.0532, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(866.9637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1725.2858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(763.6241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1583.8212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1239.9440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(512.6821, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1492.9890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1736.9622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1672.5520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(837.2556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(643.5586, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(891.2495, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1195.4183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1382.9792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(738.4008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1075.1915, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(615.3895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1137.4381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1770.5049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(732.9207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(550.0938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(499.5195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1180.6229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1381.7382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(575.7729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(725.3964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1029.2964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(607.3774, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1216.1967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(439.6217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1095.4734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(667.4060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1267.8804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(695.4124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(866.5228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(870.3498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(585.5884, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(879.9648, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.23607427055702923
Sentence level Krippendorff's alpha for Premises:  0.37665782493368705
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 424
	Prediction sentences having premises: 240
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 53
	Prediction Sentence having neither claim nor premise: 143
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 466
	Sentences having claim in only one of reference or prediction: 288
	Sentences having premise in both reference and prediction: 519
	Sentences having premise in only one of reference or prediction: 235
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(952.4501, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1026.5807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(992.9725, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(555.1312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(593.0496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(407.1443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(882.3794, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(991.1265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(681.5514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(718.0833, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(858.0172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1248.7780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1206.8656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1343.8616, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1139.9956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(957.1959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(705.5408, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1161.2672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(621.5055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(594.1901, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1384.8527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(579.0818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1286.7997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(928.3298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.3995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1234.9531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1414.6005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1365.4561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(537.0347, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.7570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(592.1051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(875.9273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1139.9084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(544.0434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(815.3495, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(501.7208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(902.4090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1447.9509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(510.8055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(345.7361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(370.2386, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(977.9579, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1195.7737, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.9914, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(586.3622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(872.0985, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(476.7353, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1043.5078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(298.8047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(862.9789, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(495.5369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(957.6391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(543.0713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(674.6338, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(633.8921, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(491.8041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(680.4087, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.33687002652519893
Sentence level Krippendorff's alpha for Premises:  0.43236074270557034
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 390
	Prediction sentences having premises: 271
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 63
	Prediction Sentence having neither claim nor premise: 156
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 504
	Sentences having claim in only one of reference or prediction: 250
	Sentences having premise in both reference and prediction: 540
	Sentences having premise in only one of reference or prediction: 214
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(767.0043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(791.8737, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(790.9625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.8702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(443.5116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(299.2486, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(605.4341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(698.7032, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(548.0295, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(529.2690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(656.2299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(722.7253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1033.8750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1001.5092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(871.6339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(691.3016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(568.5760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(960.0498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(441.2094, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(334.0089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1106.0791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.9179, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(892.9341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(676.8203, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.0185, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(910.6804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1122.2716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1107.8483, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(319.6320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(354.7973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(404.2381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(663.5955, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(888.7520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(322.9502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(583.9574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(411.6698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(705.9947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1214.0837, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(389.6462, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(237.1265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(272.1293, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(810.7296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(864.4871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.8328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(467.7100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(686.4297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.5883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(879.7784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.4540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(722.7132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(456.1965, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(724.1827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(443.7500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(554.5410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(472.2097, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.9863, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(517.7468, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3448275862068966
Sentence level Krippendorff's alpha for Premises:  0.4482758620689655
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 339
	Prediction sentences having premises: 373
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 72
	Prediction Sentence having neither claim nor premise: 114
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 507
	Sentences having claim in only one of reference or prediction: 247
	Sentences having premise in both reference and prediction: 546
	Sentences having premise in only one of reference or prediction: 208
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(694.4553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(720.5696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(748.2074, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(386.0428, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.0887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.1958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.2411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(506.6499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(445.8648, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.7537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(486.9800, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.8125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(823.0549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(869.8778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(654.3642, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(543.4844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(493.0833, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(768.1500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(355.0851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.6984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(837.5331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.5796, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(644.4460, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(512.7900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(210.6523, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(689.3835, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(870.0874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(815.4583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.0689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.4333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(308.7364, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(553.7283, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(775.3687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.7658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(465.5975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.9567, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(591.9949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1069.7317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(311.8039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.4264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(334.0817, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1636.7495, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1700.8220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(554.6865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(410.6742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(639.9844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(342.8252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(862.7461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.1569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(445.1740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(250.5882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(727.0616, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(403.0847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(528.6977, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(418.6611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.4888, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.3567, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.2891246684350133
Sentence level Krippendorff's alpha for Premises:  0.42705570291777184
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 364
	Prediction sentences having premises: 353
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 69
	Prediction Sentence having neither claim nor premise: 106
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 486
	Sentences having claim in only one of reference or prediction: 268
	Sentences having premise in both reference and prediction: 538
	Sentences having premise in only one of reference or prediction: 216
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(591.0031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(647.4244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(596.4079, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(365.8268, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(339.9274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.6437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(592.5698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(600.2075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.5037, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.1648, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(604.1624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(800.7716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(760.8511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(724.6630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(921.8698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(894.6292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(568.2630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(924.6041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(341.8110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(174.7010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(820.8893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.8713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(666.9036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(567.3879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.1223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(919.9384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1071.1782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(928.2323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.1301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.4126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(297.7162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(416.9772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(769.9047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(271.4073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(514.4011, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.8733, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1008.0992, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1295.7529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(360.1837, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.8200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.4149, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1510.0679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1534.7214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(449.9615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(362.2439, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(571.6084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.5904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(948.1333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(322.8195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(639.2812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(320.8698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(795.3934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(593.4573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(671.2922, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(683.8983, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(655.5470, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(869.5217, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.09814323607427056
Sentence level Krippendorff's alpha for Premises:  0.3183023872679045
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 548
	Prediction sentences having premises: 106
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 21
	Prediction Sentence having neither claim nor premise: 121
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 414
	Sentences having claim in only one of reference or prediction: 340
	Sentences having premise in both reference and prediction: 497
	Sentences having premise in only one of reference or prediction: 257
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(762.7712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(670.0878, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(757.2497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.8442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(297.2830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.5402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(492.3167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(635.0018, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(348.7371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.1882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(422.6363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(407.5248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(542.5779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(597.2462, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(551.0872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(567.6567, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(454.9016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(826.2864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.8393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.6324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(832.5393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.2710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(582.9480, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.1039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.3994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(712.1935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1015.1343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1166.7717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(268.5094, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.6905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.9888, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(333.9610, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(551.5356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(229.0693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(317.2450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.2535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(485.5899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(816.4089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.6541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.8844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.2457, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(435.2510, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(565.6297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.8293, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(324.2739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(613.0798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(380.0417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(834.9320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.3341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(501.9005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(359.3497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(813.6990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.3504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.6036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(405.5331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(364.2499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(404.3409, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.28116710875331563
Sentence level Krippendorff's alpha for Premises:  0.35013262599469497
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 475
	Prediction sentences having premises: 146
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 57
	Prediction Sentence having neither claim nor premise: 190
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 483
	Sentences having claim in only one of reference or prediction: 271
	Sentences having premise in both reference and prediction: 509
	Sentences having premise in only one of reference or prediction: 245
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(520.8544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(548.5480, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(596.6198, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(275.4045, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.0533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.3997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(392.1351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.6075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(396.8240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.2208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(516.6544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(272.1000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(843.5428, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(660.1709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(591.4609, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(673.9668, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(410.5998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(687.3261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(297.7868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.2531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(649.7214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.5235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(403.3337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(364.6101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.8205, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(635.2634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(768.5640, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(648.8312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(346.7334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.5214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.9480, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(370.3542, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(835.3842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.7006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(467.5817, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(389.6432, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(754.7396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1162.6940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(266.9450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.8334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.8753, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(538.6360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(731.6346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.2749, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(312.3717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(597.4798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(360.6086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(877.0948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.6589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.7904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.1898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.5777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.2218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.0657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(269.8569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(229.3589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(366.5224, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3846153846153846
Sentence level Krippendorff's alpha for Premises:  0.3023872679045093
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 284
	Prediction sentences having premises: 160
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 45
	Prediction Sentence having neither claim nor premise: 355
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 522
	Sentences having claim in only one of reference or prediction: 232
	Sentences having premise in both reference and prediction: 491
	Sentences having premise in only one of reference or prediction: 263
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(708.5549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(824.8267, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(759.3126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(445.7115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.8982, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.6351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(677.6678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(917.5713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(506.6880, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(443.4613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(579.2551, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(374.8374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(949.4097, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(820.7027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(664.9634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(567.2167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(407.7501, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(728.2101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.2261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.3883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(807.4451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(285.5656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(668.4690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(631.5190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.8506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(703.6466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1053.5734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(892.1876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.7293, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.3675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.9930, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(368.1948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(655.7864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(268.3739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(567.4114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.1541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.9608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(911.9437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(244.2683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.8533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.2633, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(457.4816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(773.7107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(387.7178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(465.6571, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(867.8040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(328.8188, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1107.6764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.2899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(845.8434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(551.1390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1059.6157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(354.8467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(503.5339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.8834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(432.3820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(422.4884, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.48806366047745353
Sentence level Krippendorff's alpha for Premises:  0.45092838196286467
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 169
	Prediction sentences having premises: 488
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 66
	Prediction Sentence having neither claim nor premise: 163
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 561
	Sentences having claim in only one of reference or prediction: 193
	Sentences having premise in both reference and prediction: 547
	Sentences having premise in only one of reference or prediction: 207
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(581.5576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(601.5331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(645.4334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(281.4947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(146.6913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.4637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.0378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(334.1339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.2798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.9000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(286.8607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(291.0441, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.9581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(418.8176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(369.5625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.3608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.4044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.2997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.0529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.2222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(697.0504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.1637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(590.7599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.2689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(276.1089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(952.4162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1361.2598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1095.7019, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.8411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.5002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.4816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.7188, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(517.2935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(316.8334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(840.8859, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(291.8289, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(511.4592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(873.0955, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.2113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.7222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.6455, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.1652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(415.8128, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.4962, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.2842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.0789, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.0582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(471.7274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.2141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.7412, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.3894, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.6707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.5558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.9088, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.2508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.3849, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.5084, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4482758620689655
Sentence level Krippendorff's alpha for Premises:  0.4005305039787799
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 110
	Prediction sentences having premises: 509
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 44
	Prediction Sentence having neither claim nor premise: 179
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 546
	Sentences having claim in only one of reference or prediction: 208
	Sentences having premise in both reference and prediction: 528
	Sentences having premise in only one of reference or prediction: 226
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(485.0599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(571.4181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(678.6224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.5799, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.9500, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.1117, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(647.7877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(508.8748, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(412.5493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(365.5721, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(497.2391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(639.2543, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(708.0781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(769.8416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(640.9349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(741.0641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(339.3886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(747.9677, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(373.5871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.4650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.0544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(184.6115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.2146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(262.6738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.0378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(458.0029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(551.9875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(440.1991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.7682, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.6553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.0759, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.7777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(364.7807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.8956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(356.0837, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.1062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(418.1141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(847.8752, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(166.7093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.2853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.2344, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(450.3557, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(587.3497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(210.2666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(294.1520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(474.8791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.3369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(675.4192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.9762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(389.9468, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.1534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(531.5604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.8984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.9736, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.8490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.3051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.5415, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3952254641909815
Sentence level Krippendorff's alpha for Premises:  0.4535809018567639
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 344
	Prediction sentences having premises: 299
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 71
	Prediction Sentence having neither claim nor premise: 182
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 526
	Sentences having claim in only one of reference or prediction: 228
	Sentences having premise in both reference and prediction: 548
	Sentences having premise in only one of reference or prediction: 206
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(313.5883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(320.3014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.8650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.4812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.5877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.0981, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.1563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.6133, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.0598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.0952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.7793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(146.5964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.5360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.5070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(257.3106, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.2288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(207.1070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(374.0616, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.3115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.7874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.6147, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.9199, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.7747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(207.8659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.8592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.8485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(515.4449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(379.9770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.7136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.7286, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.9110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.8858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.4738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.7758, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(166.3336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.5845, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(240.3542, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.3809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.5603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.8978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.3436, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.0466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.7972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.7020, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.3376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.9442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.4314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(360.9069, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.6458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.9729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.1950, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(242.7018, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.7118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.1450, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.3375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.4023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.1062, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.45623342175066317
Sentence level Krippendorff's alpha for Premises:  0.47745358090185674
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 247
	Prediction sentences having premises: 438
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 84
	Prediction Sentence having neither claim nor premise: 153
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 549
	Sentences having claim in only one of reference or prediction: 205
	Sentences having premise in both reference and prediction: 557
	Sentences having premise in only one of reference or prediction: 197
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(267.2646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(272.9482, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.7275, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.6802, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.5523, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.0143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.4215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.0623, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.5180, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.5375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.3541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.9555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.8603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.2448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.7332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(250.6885, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.9250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(297.7852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.6670, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.9764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.2798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.3248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.1591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.8793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.4405, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(319.9427, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(369.6651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(280.7693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.3072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.8027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.2408, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.4103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.4355, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.8269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.0841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.7073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.3688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(351.8332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.2218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.9436, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.0819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.7151, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.7885, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.0297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.8291, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.6015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.2052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(282.3193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.0352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.9602, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.8472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.7755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.1704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.1816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.3171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.4303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.6650, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4297082228116711
Sentence level Krippendorff's alpha for Premises:  0.506631299734748
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 283
	Prediction sentences having premises: 363
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 81
	Prediction Sentence having neither claim nor premise: 189
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 539
	Sentences having claim in only one of reference or prediction: 215
	Sentences having premise in both reference and prediction: 568
	Sentences having premise in only one of reference or prediction: 186
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(179.6228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.6865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.8723, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.7831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.3872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.3153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.9666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.6928, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.6655, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.2730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.7051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.8594, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.0770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.2409, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.9178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.0622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.8246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(250.5844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.9469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.6565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.7598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.8321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.7941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.1006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.6358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(269.9310, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.8279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.5694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.9562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.5138, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.5106, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.8837, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.8000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.1718, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.5083, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.8526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.1864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.9119, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.9027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.4600, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.2807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.0863, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.0242, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.7783, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.3116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.4962, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.7444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.8262, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.4141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.1345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.9463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.2767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.6089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.8361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.3802, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.7361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.2972, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.46153846153846156
Sentence level Krippendorff's alpha for Premises:  0.4854111405835544
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 273
	Prediction sentences having premises: 405
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 93
	Prediction Sentence having neither claim nor premise: 169
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 551
	Sentences having claim in only one of reference or prediction: 203
	Sentences having premise in both reference and prediction: 560
	Sentences having premise in only one of reference or prediction: 194
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(143.8727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.8399, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.6120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.8383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.5214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.5543, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.2873, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.9041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.0382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.5054, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.9037, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.0995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.7298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.6903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.4827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.7831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.2319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.1184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.5703, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.2318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.8540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.8176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.7220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.3210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.6009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(244.3908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.3766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.5443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.6784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.9111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.1105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.2406, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.3276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.0835, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.9262, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.5909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.4193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.7791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.7992, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.1043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.5900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.8294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.4461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.5100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.1315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.5072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.5687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.4963, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.4237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.7480, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.1436, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.5452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.1096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.2562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.7855, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.4279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.7093, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4190981432360743
Sentence level Krippendorff's alpha for Premises:  0.4907161803713528
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 299
	Prediction sentences having premises: 367
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 98
	Prediction Sentence having neither claim nor premise: 186
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 535
	Sentences having claim in only one of reference or prediction: 219
	Sentences having premise in both reference and prediction: 562
	Sentences having premise in only one of reference or prediction: 192
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(112.2626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.7840, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.5033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.9739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.2287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.8738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.2688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.5367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.0146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.8567, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.2870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.9166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.0319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.8649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.1193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.3564, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.3467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(180.4758, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.5399, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.7868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.3375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.9834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.3783, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.9337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.4036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.2869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(249.4038, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.3228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.4095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.7248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.5345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.1370, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.0932, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.7257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.3156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.0816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.0031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(239.5403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.3016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.8273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.6968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.2004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.1344, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.9597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.3707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.7532, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.5502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.0659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.8190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.9670, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.6010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.2368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.9626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.8595, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.8952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.4034, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.3214, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4456233421750663
Sentence level Krippendorff's alpha for Premises:  0.4960212201591512
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 273
	Prediction sentences having premises: 409
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 98
	Prediction Sentence having neither claim nor premise: 170
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 545
	Sentences having claim in only one of reference or prediction: 209
	Sentences having premise in both reference and prediction: 564
	Sentences having premise in only one of reference or prediction: 190
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(88.2446, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.5123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.7809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.9553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.9440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.3361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.4422, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.8086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.2372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.7948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.1430, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.5541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.3742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.9334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.4582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.0326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.8341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.4565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.5716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.5583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.2565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.9968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.3954, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.1040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.1584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.8661, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.9120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.9394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.7367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.8158, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.6402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.2494, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.3677, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.4613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.0257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.0082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.9484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.7737, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.1633, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.2344, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.1924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.7428, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.8022, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.2176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.5491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.8181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.2890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.1107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.7620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.7084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.1098, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.7796, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.1198, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.6400, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.2368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.8395, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.1722, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4297082228116711
Sentence level Krippendorff's alpha for Premises:  0.47214854111405835
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 293
	Prediction sentences having premises: 374
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 97
	Prediction Sentence having neither claim nor premise: 184
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 539
	Sentences having claim in only one of reference or prediction: 215
	Sentences having premise in both reference and prediction: 555
	Sentences having premise in only one of reference or prediction: 199
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(75.8236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.3549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.7555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.8286, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.2725, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.4004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.7503, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.9635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.4178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.0614, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.7741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.6194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.5633, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.1297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.9673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.5286, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.2828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.8118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.4437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.9577, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.5078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.2968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.2484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.1936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.6959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(174.1147, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.0002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.0811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.0245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.4899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.2383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.9797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.8265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.8027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.7641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.8870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.7835, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.8485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.0708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.2691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.2600, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.6548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.6285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.1518, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.4625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.1778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.1070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.3160, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(6.2415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.6995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.4985, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.0118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.2794, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.7089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.9565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.4089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.4575, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4482758620689655
Sentence level Krippendorff's alpha for Premises:  0.4907161803713528
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 278
	Prediction sentences having premises: 405
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 99
	Prediction Sentence having neither claim nor premise: 170
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 546
	Sentences having claim in only one of reference or prediction: 208
	Sentences having premise in both reference and prediction: 562
	Sentences having premise in only one of reference or prediction: 192
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(58.8290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.8256, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.7312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.6041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.0299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.0067, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.2038, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.6994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.7491, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.8771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.7016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.2606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.0349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.8180, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.0383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.5234, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.6659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.2784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.4117, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.7957, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.1676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.4559, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.0296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.9700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.4017, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.0644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.0254, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.9924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.8256, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.2593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.4286, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.9321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.3209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.3588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.9376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.0350, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.6052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(174.6109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.7852, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(4.0946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.7992, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.8824, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.2213, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.6586, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.8994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.7402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.9223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.4321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(5.0360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.2927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.1089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.8153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.4895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.2228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.4654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.5122, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.8143, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4350132625994695
Sentence level Krippendorff's alpha for Premises:  0.48806366047745353
Additional attributes: 
	Total Sentences: 754
	Prediction sentences having claims: 287
	Prediction sentences having premises: 384
	Reference sentences having claims: 264
	Reference sentences having premises: 319


	Prediction Sentence having both claim and premise: 100
	Prediction Sentence having neither claim nor premise: 183
	Reference Sentence having both claim and premise: 50
	Reference Sentence having neither claim nor premise: 221


	Sentences having claim in both reference and prediction: 541
	Sentences having claim in only one of reference or prediction: 213
	Sentences having premise in both reference and prediction: 561
	Sentences having premise in only one of reference or prediction: 193
				 Metric computations: None


		-------------RUN 5-----------
			------------EPOCH 1---------------
Loss:  tensor(1817.8120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2792.2930, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1845.3518, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2432.5005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3228.0537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1834.2094, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2498.0229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1922.4594, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1790.6084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2033.8611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1736.5374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(994.1000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2233.0112, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2998.9939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2800.8193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1445.7946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2971.5791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1988.9348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2563.3284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2141.4624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2911.0952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2133.6819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1928.7333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2568.5142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3109.1047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2219.2739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1659.6826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1755.8448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1399.1470, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1218.4944, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1011.8755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(911.0834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1899.9529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2105.2175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1453.6865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2537.2036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1771.9841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1794.9114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1579.6768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2142.9517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1459.1730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(704.4487, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(754.5382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1924.0558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2156.0090, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1594.4470, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2336.8281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1476.4534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1557.4077, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1536.9561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1352.1217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1267.7036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1324.4866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1883.9626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(746.7537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2326.6655, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1665.5520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1050.4543, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4188376753507014
Sentence level Krippendorff's alpha for Premises:  0.3827655310621243
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 118
	Prediction sentences having premises: 292
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction Sentence having both claim and premise: 14
	Prediction Sentence having neither claim nor premise: 103
	Reference Sentence having both claim and premise: 30
	Reference Sentence having neither claim nor premise: 148


	Sentences having claim in both reference and prediction: 354
	Sentences having claim in only one of reference or prediction: 145
	Sentences having premise in both reference and prediction: 345
	Sentences having premise in only one of reference or prediction: 154
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(1052.8604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1688.5393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1095.0679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1509.2194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2510.5190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1329.7509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1829.3339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1371.7175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1368.7744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1577.0786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1432.5659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(750.1271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1844.5991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2531.9468, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2293.9746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1152.7488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2444.5264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1534.8257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2102.0125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1772.9408, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2442.6917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1657.5969, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1573.8555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2286.5771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2819.7485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1874.6816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1268.9036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1501.9629, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1182.3152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(932.5095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(771.8499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(728.0366, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1519.0548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1556.9215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1209.4788, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2146.9771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1383.4696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1442.1560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1263.7205, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1873.0719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1330.3094, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(563.5954, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(625.9032, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1597.9404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1723.3530, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1335.0413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1972.3405, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1148.6545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1298.2644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1425.8687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1120.7529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1049.0829, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1060.0537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1603.0083, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(585.6169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1989.2601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1330.8228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(825.7098, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.47895791583166336
Sentence level Krippendorff's alpha for Premises:  0.5230460921843687
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 207
	Prediction sentences having premises: 237
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction Sentence having both claim and premise: 30
	Prediction Sentence having neither claim nor premise: 85
	Reference Sentence having both claim and premise: 30
	Reference Sentence having neither claim nor premise: 148


	Sentences having claim in both reference and prediction: 369
	Sentences having claim in only one of reference or prediction: 130
	Sentences having premise in both reference and prediction: 380
	Sentences having premise in only one of reference or prediction: 119
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(906.4637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1384.2561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(796.2050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1166.6143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2055.0354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1077.4014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1494.3938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1099.4546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1086.9447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1353.5374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1156.9253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(643.1844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1579.7610, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1989.9072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1895.2336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(972.3823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1969.2529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1134.0461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1630.4740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1589.5720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2071.1506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1238.3936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1131.7388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1904.4789, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2254.8950, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1412.3857, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(851.7271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1114.2079, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(986.4368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(737.1964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(616.4933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(608.0576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1331.2246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1371.9966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1057.0392, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1931.0989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1199.9651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1262.2842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1076.2927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1512.8457, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1149.6418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(420.6057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(502.8206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1339.9485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1545.6862, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1070.3064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1666.2128, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1005.3715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1060.3693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1169.7063, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(874.2653, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(910.4588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(900.3311, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1444.1687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(444.6009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1495.9771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(842.5507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(543.3696, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3226452905811623
Sentence level Krippendorff's alpha for Premises:  0.4188376753507014
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 296
	Prediction sentences having premises: 141
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction Sentence having both claim and premise: 25
	Prediction Sentence having neither claim nor premise: 87
	Reference Sentence having both claim and premise: 30
	Reference Sentence having neither claim nor premise: 148


	Sentences having claim in both reference and prediction: 330
	Sentences having claim in only one of reference or prediction: 169
	Sentences having premise in both reference and prediction: 354
	Sentences having premise in only one of reference or prediction: 145
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(826.7048, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1187.3679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(527.6626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(872.1049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1826.2994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(877.0076, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1301.2966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(902.5380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(859.3397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1143.0248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(939.7382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(547.5447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1227.1411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1478.3054, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1533.9064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(790.8563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1688.7461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(876.2803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1294.6515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1373.2430, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1807.7798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(924.4751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(724.8239, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1712.6606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1781.9783, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1085.2463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(549.6750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(870.5647, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(799.6907, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(476.1933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(392.0251, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(414.3592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1124.2599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1219.5256, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(867.1686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1658.3748, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1084.8590, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1092.8872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1010.4692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1303.7830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.7860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.5056, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(420.1765, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1253.9465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1207.4226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(744.5576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1234.2588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(755.3127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(960.7833, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(923.4410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(723.2288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(857.5433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(962.0768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1548.1930, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(341.6960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1215.1462, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(665.4744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(387.2972, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5190380761523046
Sentence level Krippendorff's alpha for Premises:  0.4709418837675351
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 155
	Prediction sentences having premises: 302
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction Sentence having both claim and premise: 35
	Prediction Sentence having neither claim nor premise: 77
	Reference Sentence having both claim and premise: 30
	Reference Sentence having neither claim nor premise: 148


	Sentences having claim in both reference and prediction: 379
	Sentences having claim in only one of reference or prediction: 120
	Sentences having premise in both reference and prediction: 367
	Sentences having premise in only one of reference or prediction: 132
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(531.2555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1090.7808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(367.4158, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(622.3501, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1562.0066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(734.4814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1108.5192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(703.2955, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(819.9132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(985.4690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(855.9745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(472.2169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(933.1522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1737.1492, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1364.7998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(703.1653, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1385.1821, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(865.1593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1357.9570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1055.3647, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1466.4974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(799.7847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(553.9787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1289.2307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1532.5591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(901.1116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.3734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(703.4405, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(681.4271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(419.8811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.8342, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(335.4217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1269.2218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1328.0884, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1097.9290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1855.0867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(992.7283, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1054.2537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(803.9414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1191.4011, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(794.7061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.0470, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.8211, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(887.5785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1032.9021, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(665.7429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1103.0587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.3049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(703.7319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1002.2688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(566.2190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(649.4778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(676.5973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1299.5176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(347.8661, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1147.9485, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(349.6136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(317.7225, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.15030060120240485
Sentence level Krippendorff's alpha for Premises:  0.36673346693386777
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 355
	Prediction sentences having premises: 118
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction Sentence having both claim and premise: 40
	Prediction Sentence having neither claim nor premise: 66
	Reference Sentence having both claim and premise: 30
	Reference Sentence having neither claim nor premise: 148


	Sentences having claim in both reference and prediction: 287
	Sentences having claim in only one of reference or prediction: 212
	Sentences having premise in both reference and prediction: 341
	Sentences having premise in only one of reference or prediction: 158
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(665.1035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1062.3131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(330.0065, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(588.1819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1624.7646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(728.9777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1169.6149, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(949.1663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(690.4719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(779.4236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(639.4569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(444.5176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(797.0244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1029.0591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1008.7084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(517.4320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1199.1226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(666.9125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(876.3967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(965.5493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1389.4158, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(661.0318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(501.0483, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1214.2212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1380.2874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(685.3226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.7495, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(622.8315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(496.6412, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(325.3705, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.9823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.4162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(813.4425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(876.8966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(652.7383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1097.3762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(810.7998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(774.9547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(558.9209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(836.4313, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(738.5599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.7659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(346.4610, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(722.5144, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(744.8875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.2416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(789.1531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(455.4393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(507.8667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(598.3376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.1928, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(495.2453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(493.2423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1012.1268, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.6444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(637.9778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.4144, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.3409, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3106212424849699
Sentence level Krippendorff's alpha for Premises:  0.4348697394789579
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 311
	Prediction sentences having premises: 131
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction Sentence having both claim and premise: 45
	Prediction Sentence having neither claim nor premise: 102
	Reference Sentence having both claim and premise: 30
	Reference Sentence having neither claim nor premise: 148


	Sentences having claim in both reference and prediction: 327
	Sentences having claim in only one of reference or prediction: 172
	Sentences having premise in both reference and prediction: 358
	Sentences having premise in only one of reference or prediction: 141
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(399.5963, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(876.2571, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.3714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(353.6959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1353.6617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(614.0469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(871.9389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(590.2863, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(573.4974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(643.0438, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(419.4380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(406.0547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(675.3259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(863.6920, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(827.8767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(459.9324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1118.5233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(696.4773, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(677.8915, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(785.7766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1133.2107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(486.0281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(317.1716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(998.9078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(996.8936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(481.5496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.6408, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(447.3154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(402.3531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.3433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.1035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.8057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(997.5046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1130.3467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(899.3203, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1535.3232, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(701.6047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(779.1996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(642.6544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(960.0472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(625.4370, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.7197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(347.5103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(629.9049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(810.5618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(499.9934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(862.9921, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(496.6336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.7163, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(533.7601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(368.3309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.2031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(428.1152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.5988, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.6610, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(611.1727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.1284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.3225, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.27454909819639284
Sentence level Krippendorff's alpha for Premises:  0.45490981963927857
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 322
	Prediction sentences having premises: 154
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction Sentence having both claim and premise: 42
	Prediction Sentence having neither claim nor premise: 65
	Reference Sentence having both claim and premise: 30
	Reference Sentence having neither claim nor premise: 148


	Sentences having claim in both reference and prediction: 318
	Sentences having claim in only one of reference or prediction: 181
	Sentences having premise in both reference and prediction: 363
	Sentences having premise in only one of reference or prediction: 136
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(319.5475, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(729.5959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.6511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(377.2776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1549.3767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(613.8987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1030.8813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(649.3804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(654.1649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(793.4066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(593.9844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.8401, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(852.7188, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1266.5031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1062.7576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(531.1694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1142.7344, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(606.3574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(741.1125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(698.3648, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(973.4290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(462.9461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.4546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(808.2923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(866.9917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(449.5467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.3494, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(401.4188, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.0298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.9330, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.1976, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.5299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(784.2787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(811.2151, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(622.4355, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(973.5823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(718.2909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(697.0420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(543.6967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(772.2772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(853.5320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.4001, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(408.3364, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(743.9117, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(882.4007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(610.7086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(943.7213, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(608.0294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(728.5139, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(513.6866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(542.6301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(669.1601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(662.6831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1170.3906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.0266, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.7480, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.7160, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.3357, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5070140280561122
Sentence level Krippendorff's alpha for Premises:  0.5190380761523046
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 202
	Prediction sentences having premises: 284
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction Sentence having both claim and premise: 65
	Prediction Sentence having neither claim nor premise: 78
	Reference Sentence having both claim and premise: 30
	Reference Sentence having neither claim nor premise: 148


	Sentences having claim in both reference and prediction: 376
	Sentences having claim in only one of reference or prediction: 123
	Sentences having premise in both reference and prediction: 379
	Sentences having premise in only one of reference or prediction: 120
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(150.8138, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(785.9674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.7588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(266.6750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(975.3988, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(455.7530, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(737.7510, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(496.1155, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(471.6774, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(550.5568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.1946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(377.5574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(672.3366, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1229.4653, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(965.3119, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(569.7314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1246.0574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(763.8356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(899.6206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(827.5663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1249.4946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(529.2195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(434.9552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(917.1810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(858.0112, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.1916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.7381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(413.6019, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(575.0512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.1722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.4976, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.0829, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(600.1376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(587.6888, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(465.1807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(882.2517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(573.1411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(524.9022, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(427.8257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(509.8445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(613.8041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.5897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.8162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(333.2268, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(603.9742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(354.8520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(688.5363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(398.8044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(605.4695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(484.6839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(371.4456, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(590.5381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(648.3378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1058.2886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(272.5522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(655.9362, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(323.2755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.3642, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5270541082164328
Sentence level Krippendorff's alpha for Premises:  0.41482965931863724
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 101
	Prediction sentences having premises: 326
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction Sentence having both claim and premise: 30
	Prediction Sentence having neither claim nor premise: 102
	Reference Sentence having both claim and premise: 30
	Reference Sentence having neither claim nor premise: 148


	Sentences having claim in both reference and prediction: 381
	Sentences having claim in only one of reference or prediction: 118
	Sentences having premise in both reference and prediction: 353
	Sentences having premise in only one of reference or prediction: 146
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(287.1463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(835.9703, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.2251, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.7556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(985.2883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(485.2598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(800.7213, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(432.4736, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(486.8563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(650.2897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.8784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.1945, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(391.6527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(516.4416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(680.3579, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.0164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(776.1262, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(449.5870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(551.0431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(587.3353, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(875.1885, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(480.4352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(346.0109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(724.7933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(856.0975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(386.1417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(222.1091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(386.8497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(502.3080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(229.1265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.9175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(249.2729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(819.8806, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(903.5617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(457.1136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(778.7584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(770.3045, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(700.1295, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(488.3696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(593.5548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(472.9104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.8908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.6094, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(334.1887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(491.7906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(325.8248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(430.7912, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.0738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.6893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(416.4744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.5308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(282.0172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(294.5361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.6609, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.3542, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(287.1359, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.5258, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.8065, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5591182364729459
Sentence level Krippendorff's alpha for Premises:  0.342685370741483
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 107
	Prediction sentences having premises: 356
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction sentences having both claim and premise: 41
	Prediction sentences having neither claim nor premise: 77
	Reference sentences having both claim and premise: 30
	Reference sentences having neither claim nor premise: 148


	Sentences where reference and prediction agree on claim (both have one or both lack one): 389
	Sentences having claim in only one of reference or prediction: 110
	Sentences where reference and prediction agree on premise (both have one or both lack one): 335
	Sentences having premise in only one of reference or prediction: 164
				 Metric computations: None
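
Note on the summary above: the two counts printed for claims (389 and 110) and for premises (335 and 164) each sum to the 499 total sentences, and the reported alpha values coincide with (first count - second count) / total. The sketch below only checks that arithmetic against the printed numbers. The function name `agreement_score` is hypothetical, and this is an observation about the logged values, not a statement of how the training script computes its metric. The same relation is what Krippendorff's alpha reduces to if the expected disagreement is taken to be 0.5.

```python
import math


def agreement_score(agree: int, disagree: int) -> float:
    """Return (agree - disagree) / total, i.e. 2 * observed_agreement - 1."""
    return (agree - disagree) / (agree + disagree)


# (agree, disagree, reported alpha), copied from the epoch summary above.
logged = {
    "claims": (389, 110, 0.5591182364729459),
    "premises": (335, 164, 0.342685370741483),
}

for name, (agree, disagree, reported) in logged.items():
    score = agreement_score(agree, disagree)
    # Compare with a tolerance rather than exact float equality.
    print(name, score, math.isclose(score, reported, abs_tol=1e-9))
```
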
			------------EPOCH 11---------------
Loss:  tensor(144.8433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(563.5482, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.2335, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.9626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(901.3165, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(425.4156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(732.2797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(436.2654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.7241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(657.4027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(389.6606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(378.9290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.8408, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(515.2384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(620.3462, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.3227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(658.7639, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.1052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(429.2417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(543.0750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(730.2279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(322.5954, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(252.0062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(682.0955, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(809.5186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(360.4002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.0194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(307.7790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.8686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.6431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.8527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.8998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(608.5856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(591.8964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.3972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(626.4335, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(470.4915, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(486.5066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(365.2077, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(398.7591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(257.3347, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.7079, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.8130, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.1766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.7009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.0908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(334.8718, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.6766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.2336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(370.1879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.3831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.9547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.2015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(576.6667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.1555, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.2118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.6613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.6699, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5070140280561122
Sentence level Krippendorff's alpha for Premises:  0.4348697394789579
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 166
	Prediction sentences having premises: 315
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction sentences having both claim and premise: 46
	Prediction sentences having neither claim nor premise: 64
	Reference sentences having both claim and premise: 30
	Reference sentences having neither claim nor premise: 148


	Sentences where reference and prediction agree on claim (both have one or both lack one): 376
	Sentences having claim in only one of reference or prediction: 123
	Sentences where reference and prediction agree on premise (both have one or both lack one): 358
	Sentences having premise in only one of reference or prediction: 141
				 Metric computations: None
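
Note on the summary above: the four per-side counts are consistent with the total by inclusion-exclusion, since claims + premises - both + neither equals 499 for both the prediction and the reference. A minimal sketch of that sanity check follows. The helper name `covers_total` is hypothetical and is not taken from the training script.

```python
def covers_total(total: int, with_claim: int, with_premise: int,
                 with_both: int, with_neither: int) -> bool:
    """Check that the four counts partition `total` sentences.

    Sentences with a claim or a premise = claims + premises - both;
    adding the sentences with neither should give the total.
    """
    return with_claim + with_premise - with_both + with_neither == total


# Values copied from the epoch summary above.
print("prediction consistent:", covers_total(499, 166, 315, 46, 64))
print("reference consistent:", covers_total(499, 173, 208, 30, 148))
```
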
			------------EPOCH 12---------------
Loss:  tensor(95.0465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(370.9879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.1589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.5053, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(763.8794, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(282.4808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(553.1367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(403.2181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(333.6992, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(362.7464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.4356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(376.9818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(587.8318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(410.1346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(566.7000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.6814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1105.3977, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(515.9418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(458.9701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(726.4614, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(885.1092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(239.8099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.7016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(757.4072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(492.3760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(277.4341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.8668, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.6113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.1373, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.9918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.6307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.9211, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(518.5460, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(501.8071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(269.9446, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(517.0037, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(504.4374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(531.4124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(463.3148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(572.1389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.7701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(272.1560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.2157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.8217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(579.8116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(397.9997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(685.3129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(432.4230, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(374.3909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(599.1307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.3686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(408.9809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.0847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(585.6167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.6117, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.7214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.6109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.3833, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.503006012024048
Sentence level Krippendorff's alpha for Premises:  0.4949899799599199
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 193
	Prediction sentences having premises: 284
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction sentences having both claim and premise: 52
	Prediction sentences having neither claim nor premise: 74
	Reference sentences having both claim and premise: 30
	Reference sentences having neither claim nor premise: 148


	Sentences where reference and prediction agree on claim (both have one or both lack one): 375
	Sentences having claim in only one of reference or prediction: 124
	Sentences where reference and prediction agree on premise (both have one or both lack one): 373
	Sentences having premise in only one of reference or prediction: 126
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(58.5060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.4264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.1497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.1114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(492.2972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.7744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(405.6525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.1124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(282.6733, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(274.3810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(158.9635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.7995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.8403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.4028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(368.2386, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.3517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(851.8848, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(440.2116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(401.3854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(456.3029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1029.5597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(361.8671, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.1275, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(804.3915, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1058.8345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(418.9927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.8459, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(389.8704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(474.4984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.0699, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.1652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.1649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(445.4637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(502.4196, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(269.4306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(599.6149, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(430.9983, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(447.4790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(333.2086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(367.0280, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.3999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.1163, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.8971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.9599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.0643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.1308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.9954, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.3138, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.3802, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.0002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.5052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.3489, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.1477, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(495.2871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.5519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.0649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.0292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.1897, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5190380761523046
Sentence level Krippendorff's alpha for Premises:  0.4869739478957916
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 189
	Prediction sentences having premises: 216
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction sentences having both claim and premise: 57
	Prediction sentences having neither claim nor premise: 151
	Reference sentences having both claim and premise: 30
	Reference sentences having neither claim nor premise: 148


	Sentences where reference and prediction agree on claim (both have one or both lack one): 379
	Sentences having claim in only one of reference or prediction: 120
	Sentences where reference and prediction agree on premise (both have one or both lack one): 371
	Sentences having premise in only one of reference or prediction: 128
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(109.0444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(613.0054, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.8034, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.0757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(742.4688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.8197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(616.2589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(471.3705, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(411.5707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(550.8945, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(362.4834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.6324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.6062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(422.2308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(532.5933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.0737, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(612.3107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(244.6895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.8320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(319.7285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(443.2573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(174.2289, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.3079, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(422.7430, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(393.1254, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(202.7435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.8109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.9709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.0678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.7679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.1232, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.8916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.0391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.4850, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.2406, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(431.7768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.1511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(410.2928, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.0923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(304.7288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.5784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.8595, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.3752, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.1904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(327.2779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.8719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(244.6840, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.7704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.0651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.1023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.3576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.0277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.3556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(451.2614, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.1909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.9720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.1920, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.0424, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5230460921843687
Sentence level Krippendorff's alpha for Premises:  0.43887775551102204
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 170
	Prediction sentences having premises: 310
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction sentences having both claim and premise: 48
	Prediction sentences having neither claim nor premise: 67
	Reference sentences having both claim and premise: 30
	Reference sentences having neither claim nor premise: 148


	Sentences where reference and prediction agree on claim (both have one or both lack one): 380
	Sentences having claim in only one of reference or prediction: 119
	Sentences where reference and prediction agree on premise (both have one or both lack one): 359
	Sentences having premise in only one of reference or prediction: 140
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(43.2394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(286.6616, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.7672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.4431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(442.0372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.8865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(373.8939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.6522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.0762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(226.1226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.3209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.5823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.4602, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.9383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.3602, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.9108, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(493.6362, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.1060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.5802, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.6095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(381.3702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.9103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.9519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(401.0472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(339.5197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.5097, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.8350, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.6730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.6892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.1896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.5671, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.8634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(328.8613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.0800, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.3431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.7827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.3045, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(282.4187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.6739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.3208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.7540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.1731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.7858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.0279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.6537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.1696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.8944, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.6150, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.5110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.2132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.3369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.7636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.8385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(322.7889, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.4312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.5320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.5201, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.6612, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.5150300601202404
Sentence level Krippendorff's alpha for Premises:  0.5190380761523046
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 190
	Prediction sentences having premises: 264
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction sentences having both claim and premise: 55
	Prediction sentences having neither claim nor premise: 100
	Reference sentences having both claim and premise: 30
	Reference sentences having neither claim nor premise: 148


	Sentences where reference and prediction agree on claim (both have one or both lack one): 378
	Sentences having claim in only one of reference or prediction: 121
	Sentences where reference and prediction agree on premise (both have one or both lack one): 379
	Sentences having premise in only one of reference or prediction: 120
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(37.8502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.7827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.0329, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.2533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(367.1105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.8777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(317.5328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.0950, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(202.3586, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.7584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.7167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.1141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.4582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.8885, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.7866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.5095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(414.6068, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.1607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.9375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(184.8684, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(304.3419, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.9314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.6770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.8384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(341.3969, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.5262, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.2908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.4867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.9603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.3332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.5179, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.0996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(291.3748, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.2529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.2912, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.3053, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.9632, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.9009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.9883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.4420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.8046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.5742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.7192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.5108, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.6925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.3319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.1271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.8125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.3288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.0521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.5391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.4668, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.1911, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(249.9168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.6440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.7054, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.4683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.0327, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.498997995991984
Sentence level Krippendorff's alpha for Premises:  0.4709418837675351
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 186
	Prediction sentences having premises: 278
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction sentences having both claim and premise: 50
	Prediction sentences having neither claim nor premise: 85
	Reference sentences having both claim and premise: 30
	Reference sentences having neither claim nor premise: 148


	Sentences where reference and prediction agree on claim (both have one or both lack one): 374
	Sentences having claim in only one of reference or prediction: 125
	Sentences where reference and prediction agree on premise (both have one or both lack one): 367
	Sentences having premise in only one of reference or prediction: 132
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(23.5074, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.7478, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.1421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.0325, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.5777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.2411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(237.3264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.5673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.3005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.4094, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.5608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.5283, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.0826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.1252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.7577, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.8943, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.2863, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.2978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.3731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.5388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.0984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.1620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.2190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.2271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(276.0963, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.1912, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.0306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.4312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.9029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.7890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.3378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.9960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.6636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.5255, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.0395, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.0546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.8885, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.5748, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.2662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.8077, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.2321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.5012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.9647, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.2258, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.6985, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.9096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.8081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.3510, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.2339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.3065, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.9392, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.6338, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.1069, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.5611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.8200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.2996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.5004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.2151, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.49098196392785576
Sentence level Krippendorff's alpha for Premises:  0.4829659318637275
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 182
	Prediction sentences having premises: 283
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction Sentence having both claim and premise: 55
	Prediction Sentence having neither claim nor premise: 89
	Reference Sentence having both claim and premise: 30
	Reference Sentence having neither claim nor premise: 148


	Sentences having claim in both reference and prediction: 372
	Sentences having claim in only one of reference or prediction: 127
	Sentences having premise in both reference and prediction: 370
	Sentences having premise in only one of reference or prediction: 129
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(20.1621, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(146.3324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.0553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.9609, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.3582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.1524, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.7854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.0072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.1733, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.7158, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.5623, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.4026, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.4776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.2533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.3856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.6624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(357.3076, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.4910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.9200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.0914, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.2071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.8890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.7328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(282.9591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.1774, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.0147, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.5422, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.7646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.9483, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.1449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.5546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.7766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.4210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(164.6180, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.5277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.0906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.5383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.8048, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.4162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.5554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.6122, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.6057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.8659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.0740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.5741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.0073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.5610, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.6561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.5518, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.3847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.7935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.1710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.6522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.1711, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.3322, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.1841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.9503, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.1653, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4468937875751503
Sentence level Krippendorff's alpha for Premises:  0.503006012024048
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 211
	Prediction sentences having premises: 258
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction Sentence having both claim and premise: 62
	Prediction Sentence having neither claim nor premise: 92
	Reference Sentence having both claim and premise: 30
	Reference Sentence having neither claim nor premise: 148


	Sentences having claim in both reference and prediction: 361
	Sentences having claim in only one of reference or prediction: 138
	Sentences having premise in both reference and prediction: 375
	Sentences having premise in only one of reference or prediction: 124
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(15.7089, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.7283, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.0375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.1302, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.8071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.7377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.2436, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.1574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.1600, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.8217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.6396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.8870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.6704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.8262, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.5777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.0042, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.3675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.6859, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.2682, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.1877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.0084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.5364, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.9099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.0999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.5934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.6580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.2356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.8685, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.6609, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.6789, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(6.5027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.5176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.0665, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(164.1767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.9229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.7663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.3947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.8182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.9783, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.1498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.8498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.5558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.7087, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.5596, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.3271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.0466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.3845, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.9059, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.4715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.6517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.5291, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.0896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.7065, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.3318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.5693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.8958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.3778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.7084, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4949899799599199
Sentence level Krippendorff's alpha for Premises:  0.47494989979959923
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 185
	Prediction sentences having premises: 283
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction Sentence having both claim and premise: 55
	Prediction Sentence having neither claim nor premise: 86
	Reference Sentence having both claim and premise: 30
	Reference Sentence having neither claim nor premise: 148


	Sentences having claim in both reference and prediction: 373
	Sentences having claim in only one of reference or prediction: 126
	Sentences having premise in both reference and prediction: 368
	Sentences having premise in only one of reference or prediction: 131
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(11.8581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.2592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.0897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.6360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.9872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.4974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.5967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.0120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.3947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.4550, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.2040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.2872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.3590, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.4777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.8096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.2393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(306.8904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.8382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.1202, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.7772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.6666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.3468, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.7095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.3524, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.8458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.1925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.4320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.3218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.3184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.4190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.5056, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.3895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(202.8651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.2478, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.1846, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.0587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.5307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.6445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.8543, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.1502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.9412, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.8369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.4795, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.1514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(164.5124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.0834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.5519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.5212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.7863, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.4721, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.6022, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.6804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.0193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.6886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(6.9298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.5070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.3299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.7285, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.49098196392785576
Sentence level Krippendorff's alpha for Premises:  0.4949899799599199
Additional attributes: 
	Total Sentences: 499
	Prediction sentences having claims: 184
	Prediction sentences having premises: 282
	Reference sentences having claims: 173
	Reference sentences having premises: 208


	Prediction Sentence having both claim and premise: 54
	Prediction Sentence having neither claim nor premise: 87
	Reference Sentence having both claim and premise: 30
	Reference Sentence having neither claim nor premise: 148


	Sentences having claim in both reference and prediction: 372
	Sentences having claim in only one of reference or prediction: 127
	Sentences having premise in both reference and prediction: 373
	Sentences having premise in only one of reference or prediction: 126
				 Metric computations: None
	Train size: 50 Test size: 50


		-------------RUN 1-----------
			------------EPOCH 1---------------
Loss:  tensor(1872.4136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3183.5442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1770.2771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2250.5627, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2208.6467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2447.2612, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1901.2118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3125.6064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1613.0966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3190.9980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1660.0154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2137.2903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2663.9109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2265.5950, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1638.3148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1543.0785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1280.6240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1484.7356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2766.2681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2486.8652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(815.7310, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(982.0751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1094.5737, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1967.9420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2510.9468, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1927.0370, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1950.7288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2127.5552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1558.3909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2514.4487, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2691.2854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2882.0317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2145.3193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3528.4448, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.24702058504875402
Sentence level Krippendorff's alpha for Premises:  0.19501625135427947
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 234
	Prediction sentences having premises: 1065
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 42
	Prediction Sentence having neither claim nor premise: 589
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1151
	Sentences having claim in only one of reference or prediction: 695
	Sentences having premise in both reference and prediction: 1103
	Sentences having premise in only one of reference or prediction: 743
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(1247.5443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2211.3931, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1234.8694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1565.1987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1851.2935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2036.0286, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1527.6858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2576.1460, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1315.6182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2743.3223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1287.0818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1733.4166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2317.3225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1906.4083, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1403.3479, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1376.6152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1146.4591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1404.9746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2754.6167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2443.7217, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(750.3206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(875.6822, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1007.9990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1659.6096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2144.8506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1443.2971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1513.7242, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1787.1578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1408.0944, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2349.8762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2392.5244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2443.9971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1813.3872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3172.7390, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.27518959913326113
Sentence level Krippendorff's alpha for Premises:  0.2632719393282773
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 180
	Prediction sentences having premises: 1316
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 52
	Prediction Sentence having neither claim nor premise: 402
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1177
	Sentences having claim in only one of reference or prediction: 669
	Sentences having premise in both reference and prediction: 1166
	Sentences having premise in only one of reference or prediction: 680
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(1122.9937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2002.2739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1121.3987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1365.1235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1721.8779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2009.8079, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1476.4321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2454.0195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1049.7074, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2224.1343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1079.9182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1507.2573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1970.3442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1498.7175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1164.8530, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1196.6488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1029.7662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1164.9448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2172.0029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1977.8950, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(664.7011, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(805.6683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(919.2297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1498.1646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1914.3245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1260.1602, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1340.0223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1658.9596, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1214.5793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2027.2598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2229.5000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2304.0029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1458.4972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2829.0464, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.27085590465872156
Sentence level Krippendorff's alpha for Premises:  0.41820151679306605
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 702
	Prediction sentences having premises: 949
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 163
	Prediction Sentence having neither claim nor premise: 358
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1173
	Sentences having claim in only one of reference or prediction: 673
	Sentences having premise in both reference and prediction: 1309
	Sentences having premise in only one of reference or prediction: 537
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(954.1376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1614.5032, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(845.7239, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(983.8336, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1250.5212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1840.9207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1202.3010, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2142.5947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(933.2626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1971.5210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(816.1801, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1273.7170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1703.1169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1227.7173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(961.4202, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1054.2656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(904.4460, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1048.5886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1934.3933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1798.2563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(571.4984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(707.7968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(780.1026, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1287.7657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1715.2947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(976.8800, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1013.3683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1454.0402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1030.4506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1620.7747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1938.2079, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1990.9023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1256.6746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2549.4756, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.38894907908992415
Sentence level Krippendorff's alpha for Premises:  0.4431202600216685
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 673
	Prediction sentences having premises: 1104
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 246
	Prediction Sentence having neither claim nor premise: 315
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1282
	Sentences having claim in only one of reference or prediction: 564
	Sentences having premise in both reference and prediction: 1332
	Sentences having premise in only one of reference or prediction: 514
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(833.7269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1320.9175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(683.5968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(623.0328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(855.2467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1588.5105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(961.1927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1843.2571, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(779.3984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1645.3979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(545.0944, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1047.0236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1474.8910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(977.9772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(780.4900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(960.6285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(769.9996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(937.6078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1653.8347, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1601.5308, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(445.9584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(596.9116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(615.4649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1053.5059, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1438.3446, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(651.5832, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(640.9464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1225.6469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(899.8893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1406.0022, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1708.4578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1722.5615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1041.0024, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2253.5867, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4387865655471289
Sentence level Krippendorff's alpha for Premises:  0.39328277356446373
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 475
	Prediction sentences having premises: 1228
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 181
	Prediction Sentence having neither claim nor premise: 324
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1328
	Sentences having claim in only one of reference or prediction: 518
	Sentences having premise in both reference and prediction: 1286
	Sentences having premise in only one of reference or prediction: 560
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(713.8876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1180.4773, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(625.8646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.7489, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(670.4766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1322.9364, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(749.6830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1588.3210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(643.4026, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1380.0596, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(364.8979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(851.4380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1324.8154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(775.8773, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(640.0215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(840.2202, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(636.0185, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(838.4839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1505.1185, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1522.9767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(365.0513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(508.9777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(512.4656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(949.5231, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1162.9910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(488.8702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(467.4039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1056.0919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(739.4474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1087.4689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1293.0466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1438.2130, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(829.2437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1956.7006, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.39328277356446373
Sentence level Krippendorff's alpha for Premises:  0.40520043336944744
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 311
	Prediction sentences having premises: 1191
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 112
	Prediction Sentence having neither claim nor premise: 456
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1286
	Sentences having claim in only one of reference or prediction: 560
	Sentences having premise in both reference and prediction: 1297
	Sentences having premise in only one of reference or prediction: 549
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(732.7574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1125.8534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(557.2339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.8983, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(706.0709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1177.2408, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(811.4066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1461.9504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(633.7255, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1428.2484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(405.4352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(901.7040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1045.7373, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(741.6408, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(620.6735, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(813.5787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.1202, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(669.9542, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1232.3145, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1350.8062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(314.5969, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(453.2169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(423.7701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(779.9083, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1162.7269, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(417.1979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(472.3061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1142.0129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(905.2085, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1429.2490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1540.9493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1648.2366, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(958.2377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2113.3564, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.32827735644637057
Sentence level Krippendorff's alpha for Premises:  0.47995666305525464
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 919
	Prediction sentences having premises: 726
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 179
	Prediction Sentence having neither claim nor premise: 380
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1226
	Sentences having claim in only one of reference or prediction: 620
	Sentences having premise in both reference and prediction: 1366
	Sentences having premise in only one of reference or prediction: 480
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(535.4479, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(813.8386, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.6144, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.5051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.2239, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.0887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(585.5040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1153.1301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(461.0999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1167.4036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(285.8853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(720.1044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.8787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(652.7352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(592.8481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(901.3981, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(458.4260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(739.6276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1317.4515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1295.0288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(334.2025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(580.8325, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(513.7237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1076.9335, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1189.4688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(674.5895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(643.5131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(984.6846, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(634.1967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(923.0896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1182.5540, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1495.3250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(539.6638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1500.6803, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3206933911159263
Sentence level Krippendorff's alpha for Premises:  0.44637053087757317
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 990
	Prediction sentences having premises: 575
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 188
	Prediction Sentence having neither claim nor premise: 469
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1219
	Sentences having claim in only one of reference or prediction: 627
	Sentences having premise in both reference and prediction: 1335
	Sentences having premise in only one of reference or prediction: 511
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(501.5617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(736.6384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(294.7814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(282.6586, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(447.3848, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(928.2712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(642.3896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1357.3411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(676.0331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1640.2129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.7305, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(751.6337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1037.8726, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(734.1816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(587.5162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(694.7578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(345.9804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(576.2228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1134.0376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1215.4454, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.5749, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(324.2805, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.2466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(544.7782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(817.4008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(286.6924, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.9651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(677.5435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(680.0109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(826.5739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(812.1893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1039.0929, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(612.1149, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1917.7007, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4279523293607801
Sentence level Krippendorff's alpha for Premises:  0.43228602383531955
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 521
	Prediction sentences having premises: 1170
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 177
	Prediction Sentence having neither claim nor premise: 332
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1318
	Sentences having claim in only one of reference or prediction: 528
	Sentences having premise in both reference and prediction: 1322
	Sentences having premise in only one of reference or prediction: 524
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(356.2673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(668.6472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.6741, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.2350, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(330.7193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(840.8868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(675.4669, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1273.0977, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.5815, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(965.3946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.7192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.6553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(601.6583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(347.2780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(359.5669, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(568.3613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.9134, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(533.2573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(908.0527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1038.3423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(239.9119, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(279.1533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.0097, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(549.4912, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(702.0383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(306.3333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.6374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(614.4344, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(515.6683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(668.5743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(563.3399, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(832.6797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(367.4893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1160.9250, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3531960996749729
Sentence level Krippendorff's alpha for Premises:  0.5059588299024919
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 844
	Prediction sentences having premises: 900
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 200
	Prediction Sentence having neither claim nor premise: 302
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1249
	Sentences having claim in only one of reference or prediction: 597
	Sentences having premise in both reference and prediction: 1390
	Sentences having premise in only one of reference or prediction: 456
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(238.2104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(474.6701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.5355, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.7822, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(304.9124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(833.3285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(556.1438, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1097.3242, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.8752, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(960.6141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.1799, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(316.9009, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(753.2163, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(388.9077, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.0954, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(494.2121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(306.9122, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(390.8793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(676.3131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(800.4537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.6186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(254.6864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.1585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(379.5685, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(557.8207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.3216, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.9928, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(428.4700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(413.9750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(514.7360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.8346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(679.9651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.5304, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(986.7231, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4409534127843987
Sentence level Krippendorff's alpha for Premises:  0.5037919826652222
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 567
	Prediction sentences having premises: 978
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 168
	Prediction Sentence having neither claim nor premise: 469
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1330
	Sentences having claim in only one of reference or prediction: 516
	Sentences having premise in both reference and prediction: 1388
	Sentences having premise in only one of reference or prediction: 458
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(214.6728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(364.1580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.1937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.4926, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.7008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(533.6097, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(386.1842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(661.9192, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.8386, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.4216, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.1303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(257.2943, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.0032, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(184.3245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.2195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(388.7710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.5453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.6393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(479.1151, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(676.0348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.5481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.2094, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.8307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(257.2469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(389.3932, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.0745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.0594, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.4579, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.8507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(347.7330, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(308.9105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(550.9688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(277.5803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(865.5907, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.40520043336944744
Sentence level Krippendorff's alpha for Premises:  0.485373781148429
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 746
	Prediction sentences having premises: 941
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 206
	Prediction Sentence having neither claim nor premise: 365
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1297
	Sentences having claim in only one of reference or prediction: 549
	Sentences having premise in both reference and prediction: 1371
	Sentences having premise in only one of reference or prediction: 475
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(136.0746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.9337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.2930, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.6003, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.6334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.1975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(329.9895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(503.6834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.9352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(538.7076, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.4996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.1169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(354.5938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.5325, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.2471, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(311.5569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.6599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.8803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(392.2318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(603.5105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.4843, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.5715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.4132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.6676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(314.2910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.2433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.5715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.3886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.7548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.0028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.0213, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(402.8259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(184.7911, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(633.1832, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.38894907908992415
Sentence level Krippendorff's alpha for Premises:  0.49404117009750814
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 739
	Prediction sentences having premises: 881
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 192
	Prediction Sentence having neither claim nor premise: 418
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1282
	Sentences having claim in only one of reference or prediction: 564
	Sentences having premise in both reference and prediction: 1379
	Sentences having premise in only one of reference or prediction: 467
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(98.1871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(207.5331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.8324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.7172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.9013, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.1926, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.3528, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.0027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.4772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(400.4322, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.4340, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.8693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.0667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.9690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.7805, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.9532, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.1371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.8383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(349.5600, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(534.8868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.4994, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.0758, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.9087, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.3058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.1907, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.0623, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.4760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.6425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.8252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(226.6726, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.1342, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.9409, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.8910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(588.8837, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4149512459371615
Sentence level Krippendorff's alpha for Premises:  0.4712892741061755
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 615
	Prediction sentences having premises: 1044
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 197
	Prediction Sentence having neither claim nor premise: 384
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1306
	Sentences having claim in only one of reference or prediction: 540
	Sentences having premise in both reference and prediction: 1358
	Sentences having premise in only one of reference or prediction: 488
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(67.1781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.9641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.1193, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.4871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.5351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.7683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.2642, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(449.6162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.7072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(354.2050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.0287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.2720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.9265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.5014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.4129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.0181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.5776, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.5627, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(319.8750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(482.5187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.9312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.1357, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.9323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.6920, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(207.1656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.6544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.0978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.1541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.7888, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.6353, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.0914, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(319.2993, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.2804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(455.4032, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.409534127843987
Sentence level Krippendorff's alpha for Premises:  0.4723726977248104
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 732
	Prediction sentences having premises: 931
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 212
	Prediction Sentence having neither claim nor premise: 395
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1301
	Sentences having claim in only one of reference or prediction: 545
	Sentences having premise in both reference and prediction: 1359
	Sentences having premise in only one of reference or prediction: 487
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(53.2200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.2755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.1681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.2691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.4535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.1541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.5198, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(339.3563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.3546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.2388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.4061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.2428, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.1774, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.3860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.1035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.4259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.8814, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.4961, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.9186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(387.2390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.8944, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.1569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.4757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.5484, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.1512, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.7800, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.2028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.0425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.3082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.1956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.4982, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(269.3879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.8273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.1454, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.40628385698808234
Sentence level Krippendorff's alpha for Premises:  0.47778981581798485
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 643
	Prediction sentences having premises: 1002
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 192
	Prediction Sentence having neither claim nor premise: 393
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1298
	Sentences having claim in only one of reference or prediction: 548
	Sentences having premise in both reference and prediction: 1364
	Sentences having premise in only one of reference or prediction: 482
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(42.3467, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.2403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.1344, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.9722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.4106, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.5181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.7901, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(312.5828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.2740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.7194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.1105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.8290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.8696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.2908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.9608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.0654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.3636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.9460, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.8905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(341.3168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.6133, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.4065, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.3773, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.0576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.8518, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.7206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.5139, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.5694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.1167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.4413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.2740, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(208.3802, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.3562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.9450, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.41603466955579627
Sentence level Krippendorff's alpha for Premises:  0.4528710725893824
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 676
	Prediction sentences having premises: 983
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 200
	Prediction Sentence having neither claim nor premise: 387
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1307
	Sentences having claim in only one of reference or prediction: 539
	Sentences having premise in both reference and prediction: 1341
	Sentences having premise in only one of reference or prediction: 505
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(35.4968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.5585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.5366, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.8225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.1739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.7870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.6053, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.1881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.1516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.2441, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.9890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.7622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.2865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.8464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.3300, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.9458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.3613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.5927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.8346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(295.6861, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.0871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.4497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.9968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.6763, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.7848, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.7493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.2739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.2156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.8103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.9974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.4605, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.1410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.4109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(249.7922, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4084507042253521
Sentence level Krippendorff's alpha for Premises:  0.4734561213434453
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 739
	Prediction sentences having premises: 916
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 215
	Prediction Sentence having neither claim nor premise: 406
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1300
	Sentences having claim in only one of reference or prediction: 546
	Sentences having premise in both reference and prediction: 1360
	Sentences having premise in only one of reference or prediction: 486
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(28.5473, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.0231, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.9377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.2514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.0853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.3580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.2053, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(279.3284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.5265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.6925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.7234, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.0288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.2722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.4829, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.4765, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.1810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.4008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.6506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.1081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.0246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.6703, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.7647, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.6413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.0790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.9120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.2056, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.7385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.1675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.8898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.9996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.7140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(184.0657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.8119, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.5054, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4247020585048754
Sentence level Krippendorff's alpha for Premises:  0.46153846153846156
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 592
	Prediction sentences having premises: 1055
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 190
	Prediction Sentence having neither claim nor premise: 389
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1315
	Sentences having claim in only one of reference or prediction: 531
	Sentences having premise in both reference and prediction: 1349
	Sentences having premise in only one of reference or prediction: 497
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(22.8743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.0465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.0296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.7442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.6793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.5747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.2317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.7919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.5696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.4214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.7597, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.4355, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.6799, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.6079, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.7801, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.2173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.6593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.8295, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.3050, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(254.2986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.6396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.6517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.2073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.5110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.1004, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.4569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.0514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.7990, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.8071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.3766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.0344, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.2565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.3248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.5188, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3672806067172264
Sentence level Krippendorff's alpha for Premises:  0.4593716143011918
Additional attributes: 
	Total Sentences: 1846
	Prediction sentences having claims: 797
	Prediction sentences having premises: 893
	Reference sentences having claims: 681
	Reference sentences having premises: 780


	Prediction Sentence having both claim and premise: 218
	Prediction Sentence having neither claim nor premise: 374
	Reference Sentence having both claim and premise: 125
	Reference Sentence having neither claim nor premise: 510


	Sentences having claim in both reference and prediction: 1262
	Sentences having claim in only one of reference or prediction: 584
	Sentences having premise in both reference and prediction: 1347
	Sentences having premise in only one of reference or prediction: 499
				 Metric computations: None


		-------------RUN 2-----------
			------------EPOCH 1---------------
Loss:  tensor(2002.8567, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3436.7305, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2063.2241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2631.7593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1719.4353, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2278.7727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2259.2693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1586.7152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2212.5576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1685.2261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2163.5845, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2851.0635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1762.4434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1117.8452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1239.4989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1222.7083, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2036.5906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2247.9180, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1900.4146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2659.4395, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2376.8972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2689.8877, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2810.8442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1415.1738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1755.5425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1571.3868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1404.6562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1336.4811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2246.2546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1329.0879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(928.7949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(537.7432, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1167.6082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1889.6057, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.21782729805013923
Sentence level Krippendorff's alpha for Premises:  0.1409470752089137
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 195
	Prediction sentences having premises: 1
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 1
	Prediction Sentence having neither claim nor premise: 1600
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1093
	Sentences having claim in only one of reference or prediction: 702
	Sentences having premise in both reference and prediction: 1024
	Sentences having premise in only one of reference or prediction: 771
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(1653.1033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2457.0415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1355.6580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1652.3918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1366.0952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1923.6321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1890.4243, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1327.2692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1916.9739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1390.9624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1791.2457, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2520.8608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1554.8613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(924.0744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1026.5698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1023.2897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1644.3135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1833.2554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1492.2432, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2316.1213, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1901.2946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2347.5962, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2469.5686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1288.8152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1617.4423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1412.1859, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1230.1283, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1148.0692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2143.1143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1218.8367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(842.2545, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(432.3239, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1049.5847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1637.9833, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.11977715877437323
Sentence level Krippendorff's alpha for Premises:  0.25571030640668524
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 1043
	Prediction sentences having premises: 198
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 71
	Prediction Sentence having neither claim nor premise: 625
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1005
	Sentences having claim in only one of reference or prediction: 790
	Sentences having premise in both reference and prediction: 1127
	Sentences having premise in only one of reference or prediction: 668
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(1387.5159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2078.0410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1134.5558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1382.5358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1148.6138, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1696.7449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1558.7300, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1115.8738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1677.9707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1212.6750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1522.5142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2263.8811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1442.8909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(821.4194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(920.0391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(876.1036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1357.9124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1593.0516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1256.0796, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2013.3462, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1495.6221, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1944.7654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2107.8201, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1064.1478, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1374.2230, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1210.1775, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1001.8774, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(967.9666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1943.0701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1024.4797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(711.2979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(357.2893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(863.9668, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1351.2048, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.21559888579387188
Sentence level Krippendorff's alpha for Premises:  0.40278551532033424
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 781
	Prediction sentences having premises: 718
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 170
	Prediction Sentence having neither claim nor premise: 466
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1091
	Sentences having claim in only one of reference or prediction: 704
	Sentences having premise in both reference and prediction: 1259
	Sentences having premise in only one of reference or prediction: 536
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(1252.9038, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1781.5837, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(882.1954, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1113.9670, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(923.6290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1501.2793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1311.0552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(949.9080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1411.6587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(985.6830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1212.4408, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1898.3235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1268.2688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(672.1935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(779.6333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(717.7180, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1096.1631, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1361.6655, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(977.7699, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1871.5570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1183.5815, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1638.7344, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1801.6927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(924.5969, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1180.9868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(836.1934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(643.6285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(646.4398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1570.9479, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(889.3842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(595.7170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.0712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(726.8576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1149.4911, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.25682451253481897
Sentence level Krippendorff's alpha for Premises:  0.4005571030640669
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 778
	Prediction sentences having premises: 910
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 214
	Prediction Sentence having neither claim nor premise: 321
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1128
	Sentences having claim in only one of reference or prediction: 667
	Sentences having premise in both reference and prediction: 1257
	Sentences having premise in only one of reference or prediction: 538
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(1129.6991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1592.5076, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(725.3550, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(968.3440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(767.0646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1329.8817, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1147.2429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(829.8362, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1231.9806, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(839.9492, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(996.7781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1702.6536, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1081.7452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(494.5261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(593.7191, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(575.9813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(916.2393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1127.8738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(832.8931, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1839.3737, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1073.7937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1512.7959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1623.8883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(836.7646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(969.5397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(561.5117, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(449.4626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(465.2520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1216.6412, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(763.1201, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(511.9656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(222.7194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(618.5420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(957.8729, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.33593314763231197
Sentence level Krippendorff's alpha for Premises:  0.3559888579387187
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 625
	Prediction sentences having premises: 1072
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 165
	Prediction Sentence having neither claim nor premise: 263
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1199
	Sentences having claim in only one of reference or prediction: 596
	Sentences having premise in both reference and prediction: 1217
	Sentences having premise in only one of reference or prediction: 578
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(1067.8518, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1547.4536, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(605.1348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(817.1637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(699.7277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1457.1826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1136.4766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(780.5012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1209.2240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(839.0975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(972.7048, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1966.1812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1057.4014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.0750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(453.6568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(452.2448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(745.5675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(880.5807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(657.2507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1303.4207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1049.9099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1468.3650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1556.8845, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(739.2759, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(967.1082, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(493.4461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(419.4035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.8608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1387.2622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(887.7858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(644.2413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.6815, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(711.4442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1161.5955, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.35376044568245124
Sentence level Krippendorff's alpha for Premises:  0.4128133704735376
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 827
	Prediction sentences having premises: 603
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 156
	Prediction Sentence having neither claim nor premise: 521
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1215
	Sentences having claim in only one of reference or prediction: 580
	Sentences having premise in both reference and prediction: 1268
	Sentences having premise in only one of reference or prediction: 527
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(856.7700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1725.4644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(692.1019, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(789.1216, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(574.6002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(973.3361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(967.8005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(707.3440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1103.6113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(759.2798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(751.9402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1493.7864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1163.5522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(397.0184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.5359, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(492.9765, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(718.6157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1149.6890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(831.7298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1614.8618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1406.4865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1830.5093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1700.5017, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(759.1118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(753.4665, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(513.1264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(369.2153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(373.7415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1237.6825, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(656.4830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(453.6984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.3971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(567.6410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(920.3400, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.11086350974930359
Sentence level Krippendorff's alpha for Premises:  0.3002785515320334
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 1355
	Prediction sentences having premises: 240
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 82
	Prediction Sentence having neither claim nor premise: 282
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 997
	Sentences having claim in only one of reference or prediction: 798
	Sentences having premise in both reference and prediction: 1167
	Sentences having premise in only one of reference or prediction: 628
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(880.0790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1519.0581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(675.0382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(676.2151, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(577.0544, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1027.8809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(855.8975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(720.5610, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(862.3359, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(625.8241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(700.1775, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1381.5979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(754.1073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(286.6429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(338.1068, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(386.7761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(619.3040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(743.7677, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(576.7324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1158.6882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(887.2072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1294.6798, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1291.6315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(640.6023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(761.9819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(405.3851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(307.9069, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(306.7744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1020.9550, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(672.2966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(410.2712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.1628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(503.9646, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(652.0651, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.38830083565459605
Sentence level Krippendorff's alpha for Premises:  0.4306406685236769
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 762
	Prediction sentences having premises: 857
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 181
	Prediction Sentence having neither claim nor premise: 357
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1246
	Sentences having claim in only one of reference or prediction: 549
	Sentences having premise in both reference and prediction: 1284
	Sentences having premise in only one of reference or prediction: 511
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(608.3744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1091.8328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(396.2844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(439.1434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(389.1752, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(718.7559, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(638.2140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(486.8249, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(721.4964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(481.4465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(499.2181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1196.7812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(757.3710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(202.5869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(257.5097, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.5895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(528.5609, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(598.4700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(443.4739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(806.2440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(596.9267, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(889.8281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1012.4706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(469.9111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(486.8693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.3077, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.7958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.8770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(575.1809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(486.4761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.4516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.0007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(371.4161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(482.1288, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3938718662952646
Sentence level Krippendorff's alpha for Premises:  0.4607242339832869
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 773
	Prediction sentences having premises: 822
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 206
	Prediction Sentence having neither claim nor premise: 406
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1251
	Sentences having claim in only one of reference or prediction: 544
	Sentences having premise in both reference and prediction: 1311
	Sentences having premise in only one of reference or prediction: 484
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(515.3893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(944.3012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(311.8595, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(319.5265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(297.0369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(603.7301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(547.4945, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(406.6535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(538.7629, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(516.7581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(588.5786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1151.6213, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(534.2369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.3088, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.6095, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.4306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(398.7811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(453.4115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.7473, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(589.7249, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(467.3817, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(719.6376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(834.2238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(376.4589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(387.5023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.0643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.4286, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(158.1917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(525.3789, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.8638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.3644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.0967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(304.3900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(418.6097, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.28245125348189415
Sentence level Krippendorff's alpha for Premises:  0.45181058495821724
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 1027
	Prediction sentences having premises: 758
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 257
	Prediction Sentence having neither claim nor premise: 267
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1151
	Sentences having claim in only one of reference or prediction: 644
	Sentences having premise in both reference and prediction: 1303
	Sentences having premise in only one of reference or prediction: 492
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(430.6888, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(793.6654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(262.2903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.5456, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(240.1132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(479.1241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(450.1693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(328.2272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(434.1277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(402.5215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(439.3285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(882.2909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(446.7508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.1946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.4420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.8929, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(316.4302, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(387.2145, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.7302, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(482.9398, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(378.7410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(601.2976, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(685.4124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.0529, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(349.9348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.4401, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.7507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.4791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(554.3237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(392.6403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.0263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.1127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.2124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(351.3885, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4083565459610028
Sentence level Krippendorff's alpha for Premises:  0.4807799442896936
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 744
	Prediction sentences having premises: 802
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 180
	Prediction Sentence having neither claim nor premise: 429
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1264
	Sentences having claim in only one of reference or prediction: 531
	Sentences having premise in both reference and prediction: 1329
	Sentences having premise in only one of reference or prediction: 466
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(347.1055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(723.3946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(210.0712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.1111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.9124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.4127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.1301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.2863, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.9329, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(285.4030, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.2849, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(685.6559, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(388.8490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.6490, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.6825, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.9458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.2733, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(332.7897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.7778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(361.5281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(480.0072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(582.5228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(612.3893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.0196, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.8553, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.5349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.7703, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.0075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(332.4233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.0264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.4984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.9355, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.5684, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(330.4465, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.39944289693593316
Sentence level Krippendorff's alpha for Premises:  0.42952646239554315
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 596
	Prediction sentences having premises: 1110
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 208
	Prediction Sentence having neither claim nor premise: 297
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1256
	Sentences having claim in only one of reference or prediction: 539
	Sentences having premise in both reference and prediction: 1283
	Sentences having premise in only one of reference or prediction: 512
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(296.5402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(658.5986, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.4559, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.0929, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(164.7239, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(297.3067, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(298.1148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.0987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.9561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.7329, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.5116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.7333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(367.0495, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.8743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.1770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.6074, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.9734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(271.6941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.7845, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(317.3565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(479.1752, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(456.6713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(499.8543, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(192.6213, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(242.1403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.4188, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.5298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.6304, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(298.7689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(294.8714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.3624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.9216, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(229.7728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(286.3923, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.29693593314763234
Sentence level Krippendorff's alpha for Premises:  0.4206128133704735
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 1068
	Prediction sentences having premises: 558
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 194
	Prediction Sentence having neither claim nor premise: 363
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1164
	Sentences having claim in only one of reference or prediction: 631
	Sentences having premise in both reference and prediction: 1275
	Sentences having premise in only one of reference or prediction: 520
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(293.2061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(953.7157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.1688, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.6146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(142.5451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.6752, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.9802, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.3254, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(222.4028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(188.6373, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.9064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(541.3511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(287.6099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.1539, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.3559, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.0886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.7186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.4076, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.0787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.3781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(393.3558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(526.9494, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(532.3313, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.7786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.6652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.0734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.0340, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.9662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(276.9099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.5743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.7258, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.2855, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.4249, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.4579, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3816155988857939
Sentence level Krippendorff's alpha for Premises:  0.48969359331476325
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 836
	Prediction sentences having premises: 900
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 255
	Prediction Sentence having neither claim nor premise: 314
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1240
	Sentences having claim in only one of reference or prediction: 555
	Sentences having premise in both reference and prediction: 1337
	Sentences having premise in only one of reference or prediction: 458
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(185.5904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(565.0575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.8811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.9020, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.1002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.0558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.8548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.2273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.8337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.0186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.8910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(511.0715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(279.5974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.4714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.3331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.7341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.8161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(303.9363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.2943, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.4245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.3515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(376.3761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(413.5448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.2352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.0071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.9384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.9422, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.9025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.5936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.0599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.0129, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.8904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.0131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.9189, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.42506963788300833
Sentence level Krippendorff's alpha for Premises:  0.5041782729805013
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 685
	Prediction sentences having premises: 953
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 201
	Prediction Sentence having neither claim nor premise: 358
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1279
	Sentences having claim in only one of reference or prediction: 516
	Sentences having premise in both reference and prediction: 1350
	Sentences having premise in only one of reference or prediction: 445
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(139.3760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.4525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.1960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.1333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.5109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.7075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.4358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.0146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.2562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.0326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.5155, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(416.0421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.9836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.6750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.1476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.7820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.7172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.3468, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.7854, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.7952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.7554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(241.0499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(361.3041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.9751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.0354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.1624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.1171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.3634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(156.7514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.6054, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.4348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(8.4947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.5743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.4408, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3448467966573816
Sentence level Krippendorff's alpha for Premises:  0.4986072423398329
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 859
	Prediction sentences having premises: 830
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 219
	Prediction Sentence having neither claim nor premise: 325
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1207
	Sentences having claim in only one of reference or prediction: 588
	Sentences having premise in both reference and prediction: 1345
	Sentences having premise in only one of reference or prediction: 450
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(112.6325, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(367.7845, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.3071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.4397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.5211, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.6804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(184.4066, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.9273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.5851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.5387, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.9895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(365.6374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.9160, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.7829, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.5453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.8452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.7432, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.0128, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.1717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.0620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(223.3145, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.3310, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.0514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.2851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.5446, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.4138, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.7820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.8941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.1386, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.1676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.2767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(4.8621, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.4352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.1499, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4016713091922005
Sentence level Krippendorff's alpha for Premises:  0.490807799442897
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 728
	Prediction sentences having premises: 985
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 222
	Prediction Sentence having neither claim nor premise: 304
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1258
	Sentences having claim in only one of reference or prediction: 537
	Sentences having premise in both reference and prediction: 1338
	Sentences having premise in only one of reference or prediction: 457
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(93.0757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.3157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.2126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.0828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.1252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.2570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.6934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.8210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.7151, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.1081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.1978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.1440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.7570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.9908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.2585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.1229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.1790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.3169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.7464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.6937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.8634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.2898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.7074, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.0563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.8870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.8389, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.2007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.4404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.9680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.6028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.9677, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.4300, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.6295, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.4209, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3559888579387187
Sentence level Krippendorff's alpha for Premises:  0.5086350974930363
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 841
	Prediction sentences having premises: 833
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 217
	Prediction Sentence having neither claim nor premise: 338
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1217
	Sentences having claim in only one of reference or prediction: 578
	Sentences having premise in both reference and prediction: 1354
	Sentences having premise in only one of reference or prediction: 441
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(71.7170, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(328.0707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.5638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.4702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.0282, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.4140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.3498, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.5523, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.4477, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.1937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.6148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(339.8796, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.9424, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.5606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.1130, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.0585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.3103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.3078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.7076, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.9236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.6418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.4821, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.4826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.0659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.5720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.4955, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.3345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.4731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.7780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.1587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.4355, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3.3625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.7017, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.9911, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4050139275766017
Sentence level Krippendorff's alpha for Premises:  0.490807799442897
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 729
	Prediction sentences having premises: 967
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 220
	Prediction Sentence having neither claim nor premise: 319
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1261
	Sentences having claim in only one of reference or prediction: 534
	Sentences having premise in both reference and prediction: 1338
	Sentences having premise in only one of reference or prediction: 457
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(57.1839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.5842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.5034, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.3035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.7874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.3031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.2921, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.3442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.6127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.2518, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.6687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(287.1198, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(105.8904, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.7695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.7067, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.7901, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.7351, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.9044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.4785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.5715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.0086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.3099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(158.4693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.8167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.9030, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.2195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.1285, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.4246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.2272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.3675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.5070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2.2820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.2275, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.1674, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3827298050139276
Sentence level Krippendorff's alpha for Premises:  0.501949860724234
Additional attributes: 
	Total Sentences: 1795
	Prediction sentences having claims: 797
	Prediction sentences having premises: 937
	Reference sentences having claims: 703
	Reference sentences having premises: 770


	Prediction Sentence having both claim and premise: 237
	Prediction Sentence having neither claim nor premise: 298
	Reference Sentence having both claim and premise: 146
	Reference Sentence having neither claim nor premise: 468


	Sentences having claim in both reference and prediction: 1241
	Sentences having claim in only one of reference or prediction: 554
	Sentences having premise in both reference and prediction: 1348
	Sentences having premise in only one of reference or prediction: 447
				 Metric computations: None


		-------------RUN 3-----------
			------------EPOCH 1---------------
Loss:  tensor(2652.3521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3175.3574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3715.0720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2082.4070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1190.5710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(919.0165, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1694.7817, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(648.8984, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2410.2156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2566.0547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1270.9169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1520.9584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2812.4705, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1429.5526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1808.0312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2965.8652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2901.1660, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2804.7354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2914.1062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2253.1387, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1921.7126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1630.6533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1792.9625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1451.7869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2130.8906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1335.1714, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1409.5449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1993.6237, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2108.6667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1200.5444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1571.2396, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1920.1899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1477.2209, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.11065573770491799
Sentence level Krippendorff's alpha for Premises:  0.13729508196721307
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 1008
	Prediction sentences having premises: 71
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 20
	Prediction Sentence having neither claim nor premise: 893
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences having claim in both reference and prediction: 1084
	Sentences having claim in only one of reference or prediction: 868
	Sentences having premise in both reference and prediction: 1110
	Sentences having premise in only one of reference or prediction: 842
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(2008.8069, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2491.2810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2861.7554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1586.0625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(951.0931, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(669.9427, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1328.8246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(518.8829, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1866.2433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2126.1812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1098.9866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1264.5443, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2364.8767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1144.1294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1526.6243, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2642.5298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2526.8669, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2511.8474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2540.2185, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1990.0333, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1516.5476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1259.9956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1587.0229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1268.7048, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1949.9131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1231.1965, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1173.7292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1657.6182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1794.1868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1059.3652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1341.1587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1678.5144, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1258.0510, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.33709016393442626
Sentence level Krippendorff's alpha for Premises:  0.35860655737704916
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 437
	Prediction sentences having premises: 675
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 81
	Prediction Sentence having neither claim nor premise: 921
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences having claim in both reference and prediction: 1305
	Sentences having claim in only one of reference or prediction: 647
	Sentences having premise in both reference and prediction: 1326
	Sentences having premise in only one of reference or prediction: 626
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(1824.0359, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2243.0620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2496.6157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1460.7374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(858.1985, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(591.8704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1161.3940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(419.8624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1590.3479, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1872.0046, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(991.0445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1132.8617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2018.9939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.7461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1332.9506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2282.2996, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2138.1472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2211.6394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2224.3115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1768.4236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1192.7610, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1001.2476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1399.3693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1153.3075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1782.5823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1153.4043, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(943.7144, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1336.3806, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1493.8992, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(905.3751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1060.8679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1404.7277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(976.0442, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.35860655737704916
Sentence level Krippendorff's alpha for Premises:  0.451844262295082
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 510
	Prediction sentences having premises: 974
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 140
	Prediction Sentence having neither claim nor premise: 608
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences having claim in both reference and prediction: 1326
	Sentences having claim in only one of reference or prediction: 626
	Sentences having premise in both reference and prediction: 1417
	Sentences having premise in only one of reference or prediction: 535
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(1478.2614, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1904.3428, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2038.2908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1186.5566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(737.6662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(492.2616, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(938.4181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(318.7554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1416.5785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1695.5999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(883.6171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1015.8956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1747.2072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(658.2162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1167.8865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1850.9792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1690.4708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1846.8962, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1950.0392, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1509.6951, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(875.5723, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(749.5405, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1104.7151, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1012.0802, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1457.5188, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(926.9762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(711.5813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1016.5048, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1104.1404, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(696.2903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(695.4055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(955.5649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(773.7753, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3903688524590164
Sentence level Krippendorff's alpha for Premises:  0.4456967213114754
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 727
	Prediction sentences having premises: 926
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 161
	Prediction Sentence having neither claim nor premise: 460
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences having claim in both reference and prediction: 1357
	Sentences having claim in only one of reference or prediction: 595
	Sentences having premise in both reference and prediction: 1411
	Sentences having premise in only one of reference or prediction: 541
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(1219.0637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1702.2980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1823.8184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(989.3390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(579.4564, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(385.3633, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(737.2883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.5005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1137.7465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1396.6730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(731.4254, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(800.8865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1537.1123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(450.9343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(969.8849, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1442.9574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1359.4064, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1602.2900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1719.3477, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1227.0898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(614.0713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(542.3295, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(860.5180, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(812.1236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1124.2998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(749.9680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(558.0623, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(806.2630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(897.7684, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(521.1725, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(466.4395, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(679.8506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(570.4995, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3698770491803278
Sentence level Krippendorff's alpha for Premises:  0.4252049180327869
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 717
	Prediction sentences having premises: 1052
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 171
	Prediction Sentence having neither claim nor premise: 354
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences having claim in both reference and prediction: 1337
	Sentences having claim in only one of reference or prediction: 615
	Sentences having premise in both reference and prediction: 1391
	Sentences having premise in only one of reference or prediction: 561
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(1069.7252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1460.8574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1499.1619, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(806.9258, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(532.6511, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(399.8265, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(570.5132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.7592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(900.0848, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1264.3137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(584.1974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(670.5548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1341.9895, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.5939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(735.4973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1257.7435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(951.1923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1175.8579, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1356.9236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1074.3060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(470.9482, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(449.3334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(767.9154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(670.7559, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1078.4570, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(693.0405, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(590.1566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(700.8088, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(690.4793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(384.2465, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(377.1750, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(653.2568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(355.8866, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3965163934426229
Sentence level Krippendorff's alpha for Premises:  0.45594262295081966
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 447
	Prediction sentences having premises: 1198
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 135
	Prediction Sentence having neither claim nor premise: 442
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences having claim in both reference and prediction: 1363
	Sentences having claim in only one of reference or prediction: 589
	Sentences having premise in both reference and prediction: 1421
	Sentences having premise in only one of reference or prediction: 531
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(927.8796, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1192.3390, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1197.9058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(641.4905, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(376.2784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(324.5461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.0081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.7131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(832.2354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1223.5908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(478.0312, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(500.2321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1139.9386, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.3091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(568.0615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1257.3079, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(828.6875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(922.6436, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1198.9761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1055.6609, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(344.5173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.1692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(609.7045, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(579.1075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(770.0156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(482.9018, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(335.9235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(559.7442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(617.5410, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(286.0556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(259.8395, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(398.7150, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(677.9272, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.40676229508196726
Sentence level Krippendorff's alpha for Premises:  0.42725409836065575
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 469
	Prediction sentences having premises: 612
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 101
	Prediction Sentence having neither claim nor premise: 972
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences with matching claim label in reference and prediction: 1373
	Sentences having claim in only one of reference or prediction: 579
	Sentences with matching premise label in reference and prediction: 1393
	Sentences having premise in only one of reference or prediction: 559
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(1349.9338, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2318.3188, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2299.9316, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1043.8875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(349.9278, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.3280, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(504.4969, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.9981, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(915.2368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1104.6672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(399.1147, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(370.4416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1426.5035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.6538, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(596.3585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1206.6418, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(898.2900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1147.9764, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1281.4055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1093.1650, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(356.7376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.9033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(603.1499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(551.0402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1125.8809, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(585.0754, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.3641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(609.5223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(652.6914, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(421.6113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(340.8113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(509.4231, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(633.2277, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.1526639344262295
Sentence level Krippendorff's alpha for Premises:  0.28995901639344257
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 1389
	Prediction sentences having premises: 290
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 123
	Prediction Sentence having neither claim nor premise: 396
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences with matching claim label in reference and prediction: 1125
	Sentences having claim in only one of reference or prediction: 827
	Sentences with matching premise label in reference and prediction: 1259
	Sentences having premise in only one of reference or prediction: 693
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(1067.8425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1519.5703, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1697.5365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(868.1287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(374.1857, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.8810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(435.3761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.1178, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(586.7717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(862.1056, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(379.7383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(322.3195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(919.7162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(207.9352, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(476.4749, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(815.1595, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(818.3290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(804.6571, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1201.7444, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(842.1549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(456.2318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(698.2400, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(930.1907, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(674.6147, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(937.6322, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(734.8686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(493.7168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(536.2925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(501.4384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(401.0332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(299.3003, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(455.0394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(280.8619, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.40676229508196726
Sentence level Krippendorff's alpha for Premises:  0.4733606557377049
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 729
	Prediction sentences having premises: 897
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 196
	Prediction Sentence having neither claim nor premise: 522
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences with matching claim label in reference and prediction: 1373
	Sentences having claim in only one of reference or prediction: 579
	Sentences with matching premise label in reference and prediction: 1438
	Sentences having premise in only one of reference or prediction: 514
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(516.9049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(992.7261, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1021.9126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(514.5692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.7057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(240.4233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(336.3838, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.5780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(499.4911, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(993.8612, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(481.8342, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(413.2107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1225.1345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.5738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(569.2147, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(877.7371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(767.2137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(978.4898, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1171.4708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(832.5267, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.1811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.0788, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.5054, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(399.0628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(619.2772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.7643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.2445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(408.7275, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(529.5409, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(351.9913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(220.2978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(361.4315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.1507, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.41495901639344257
Sentence level Krippendorff's alpha for Premises:  0.3831967213114754
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 229
	Prediction sentences having premises: 1395
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 83
	Prediction Sentence having neither claim nor premise: 411
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences with matching claim label in reference and prediction: 1381
	Sentences having claim in only one of reference or prediction: 571
	Sentences with matching premise label in reference and prediction: 1350
	Sentences having premise in only one of reference or prediction: 602
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(1234.4561, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1723.2142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1497.0659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(716.0547, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.1367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.7778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(439.3790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.0421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(492.0790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(677.1262, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.3127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.1649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(738.0706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.5111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(299.9459, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(543.1719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(346.2682, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.3839, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(757.4125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(528.8851, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.1835, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.2607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(368.4321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.2343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(534.3126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(337.2274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.6619, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.9534, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.6644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.7335, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.8519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(206.8658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.0529, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.32991803278688525
Sentence level Krippendorff's alpha for Premises:  0.40573770491803274
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 1132
	Prediction sentences having premises: 497
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 155
	Prediction Sentence having neither claim nor premise: 478
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences with matching claim label in reference and prediction: 1298
	Sentences having claim in only one of reference or prediction: 654
	Sentences with matching premise label in reference and prediction: 1372
	Sentences having premise in only one of reference or prediction: 580
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(304.2225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(782.1582, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(821.7844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(378.2844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.8085, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.4918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.2056, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.6367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(330.2201, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(630.7875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(240.9176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(184.1833, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(636.4563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.2843, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.0864, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(566.7908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.1991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(412.3122, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(701.3148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(494.8020, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(132.6625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.4463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(279.1411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(262.2168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(372.9388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(271.6913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.9116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(296.4359, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.7376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.9531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.5499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.0414, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(174.8989, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.45389344262295084
Sentence level Krippendorff's alpha for Premises:  0.45389344262295084
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 411
	Prediction sentences having premises: 1202
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 130
	Prediction Sentence having neither claim nor premise: 469
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences with matching claim label in reference and prediction: 1419
	Sentences having claim in only one of reference or prediction: 533
	Sentences with matching premise label in reference and prediction: 1419
	Sentences having premise in only one of reference or prediction: 533
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(197.7307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(623.1340, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(664.9526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(300.4282, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.5614, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(149.7566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.9588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.8575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(304.4322, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.4505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.7884, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.0378, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(694.1517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.5617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(218.2218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(485.7868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(232.2349, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(330.0588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(669.1537, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(427.0706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.5337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.9707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(239.5086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(183.9755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(281.0199, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.7406, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.1625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.1595, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(210.7593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.2702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.0815, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.2466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.3135, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3770491803278688
Sentence level Krippendorff's alpha for Premises:  0.45594262295081966
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 934
	Prediction sentences having premises: 730
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 191
	Prediction Sentence having neither claim nor premise: 479
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences with matching claim label in reference and prediction: 1344
	Sentences having claim in only one of reference or prediction: 608
	Sentences with matching premise label in reference and prediction: 1421
	Sentences having premise in only one of reference or prediction: 531
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(157.0297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(600.0287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(564.0183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.7697, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.8049, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.8743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.9987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.2384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.8780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(420.3013, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.1785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.8509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(550.2923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.5028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.1227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(379.1386, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.1007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.3606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(560.0496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(376.0519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.6875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.8580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.9195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.7171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.0314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(161.9513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.8360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.8974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.8667, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.0829, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.8032, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.5860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.0594, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4170081967213115
Sentence level Krippendorff's alpha for Premises:  0.47438524590163933
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 575
	Prediction sentences having premises: 1076
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 167
	Prediction Sentence having neither claim nor premise: 468
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences with matching claim label in reference and prediction: 1383
	Sentences having claim in only one of reference or prediction: 569
	Sentences with matching premise label in reference and prediction: 1439
	Sentences having premise in only one of reference or prediction: 513
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(135.2987, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(483.3277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(488.5363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.2703, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.3966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.8350, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(148.0868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.7777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.4026, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(322.7767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.8989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.9784, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(477.6652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.8599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.8096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.2515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.4975, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.5475, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(431.9171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(307.6405, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.3337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.7429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.3929, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.1619, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.7731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.3592, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.6036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.4207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.8705, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.0358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.6581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.6858, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.9458, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.41290983606557374
Sentence level Krippendorff's alpha for Premises:  0.4661885245901639
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 601
	Prediction sentences having premises: 1064
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 187
	Prediction Sentence having neither claim nor premise: 474
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences with matching claim label in reference and prediction: 1379
	Sentences having claim in only one of reference or prediction: 573
	Sentences with matching premise label in reference and prediction: 1431
	Sentences having premise in only one of reference or prediction: 521
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(104.0641, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(425.8557, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(424.8903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.1696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.8530, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.7057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(122.5339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(11.6819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.4697, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(262.2659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.1036, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.9599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(369.8860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.0428, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.8672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.1376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.6770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.8084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(346.3042, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.4387, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.8622, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.0591, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.9005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.8472, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(121.4471, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.7754, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.6681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.6733, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.0925, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.5997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.0876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.2957, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.8087, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3811475409836066
Sentence level Krippendorff's alpha for Premises:  0.4713114754098361
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 748
	Prediction sentences having premises: 939
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 202
	Prediction Sentence having neither claim nor premise: 467
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences with matching claim label in reference and prediction: 1348
	Sentences having claim in only one of reference or prediction: 604
	Sentences with matching premise label in reference and prediction: 1436
	Sentences having premise in only one of reference or prediction: 516
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(87.3181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(392.2729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.7950, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.9850, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.1557, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.9689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.4998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.3247, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.0461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.9940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.6520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.3031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(330.1403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.0499, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.3286, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.0638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.4745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.1171, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.9998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.1091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.0181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.9152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.7508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.5704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.2372, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.2393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.2918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.5076, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.8728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.6727, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.6346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.2607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.8172, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.41495901639344257
Sentence level Krippendorff's alpha for Premises:  0.4795081967213115
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 641
	Prediction sentences having premises: 1023
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 187
	Prediction Sentence having neither claim nor premise: 475
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences having claim in both reference and prediction: 1381
	Sentences having claim in only one of reference or prediction: 571
	Sentences having premise in both reference and prediction: 1444
	Sentences having premise in only one of reference or prediction: 508
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(63.9585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(346.5800, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(370.1768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.9701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.0855, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.6554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.2409, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(6.5810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(94.9934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.4717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.9083, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.6024, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(295.8886, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.1760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.6037, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.4804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.0963, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.2007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(251.2459, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.8743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.4788, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.1941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.0273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.6366, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.5583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.3195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.4685, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.7763, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.6961, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.5690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.8296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.2631, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.8587, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3903688524590164
Sentence level Krippendorff's alpha for Premises:  0.46209016393442626
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 699
	Prediction sentences having premises: 998
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 198
	Prediction Sentence having neither claim nor premise: 453
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences having claim in both reference and prediction: 1357
	Sentences having claim in only one of reference or prediction: 595
	Sentences having premise in both reference and prediction: 1427
	Sentences having premise in only one of reference or prediction: 525
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(49.3602, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(322.2660, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(380.5343, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.6566, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.6918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.3612, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.1782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(5.7919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.2640, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.8640, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.7477, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.8140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(276.0429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.5274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.1562, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.9300, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.1883, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.4870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.3885, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.8679, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.7321, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.4447, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.6342, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.5681, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.3363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.1041, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.9093, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.6469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.5496, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.8691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.1948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.9098, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.2770, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.39139344262295084
Sentence level Krippendorff's alpha for Premises:  0.4702868852459017
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 678
	Prediction sentences having premises: 1000
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 190
	Prediction Sentence having neither claim nor premise: 464
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences having claim in both reference and prediction: 1358
	Sentences having claim in only one of reference or prediction: 594
	Sentences having premise in both reference and prediction: 1435
	Sentences having premise in only one of reference or prediction: 517
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(42.7376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(310.9191, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(391.3060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.8862, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.9040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.1195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.1463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(4.7471, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.1618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.2823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.7875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.5297, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.5094, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.6158, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(21.5325, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.0874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.5200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.3867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.3576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.9526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.0481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.4954, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.4061, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.3224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.7003, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.3298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.0052, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.5829, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(96.1625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.2187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.8457, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.0960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.8035, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.40573770491803274
Sentence level Krippendorff's alpha for Premises:  0.4579918032786885
Additional attributes: 
	Total Sentences: 1952
	Prediction sentences having claims: 684
	Prediction sentences having premises: 1020
	Reference sentences having claims: 724
	Reference sentences having premises: 849


	Prediction Sentence having both claim and premise: 182
	Prediction Sentence having neither claim nor premise: 430
	Reference Sentence having both claim and premise: 153
	Reference Sentence having neither claim nor premise: 532


	Sentences having claim in both reference and prediction: 1372
	Sentences having claim in only one of reference or prediction: 580
	Sentences having premise in both reference and prediction: 1423
	Sentences having premise in only one of reference or prediction: 529
				 Metric computations: None


		-------------RUN 4-----------
			------------EPOCH 1---------------
Loss:  tensor(2714.6675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2067.0662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2505.3267, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2575.5708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2961.7288, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2299.7034, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3339.2822, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2787.2058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1221.9565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2431.1919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2515.3599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1444.4303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1835.1428, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1973.9744, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1360.3060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1495.7302, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1674.7747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2226.8413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(921.2073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1903.9099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1893.3630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2449.2771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2788.9729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1949.9868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1638.4246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1071.0698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1166.5220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(632.6522, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2229.4517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2143.8694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1125.5432, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1570.3245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1020.0097, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1344.2749, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2011.3662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2447.6804, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2531.1184, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.24292452830188682
Sentence level Krippendorff's alpha for Premises:  0.11438679245283023
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 0
	Prediction sentences having premises: 0
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 0
	Prediction Sentence having neither claim nor premise: 1696
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1054
	Sentences having claim in only one of reference or prediction: 642
	Sentences having premise in both reference and prediction: 945
	Sentences having premise in only one of reference or prediction: 751
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(2057.5615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1502.3005, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1800.4475, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1707.2246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2318.2505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1709.7605, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2508.2993, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2125.8916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(935.8508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1849.6394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1988.8916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1125.4976, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1577.1235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1679.1611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1134.2104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1256.6270, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1446.5098, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1942.9329, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(773.4388, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1627.8203, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1689.6476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2167.7686, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2569.1438, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1774.1226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1533.4092, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(995.9508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1069.7965, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(554.8630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2053.9624, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2250.6072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(998.3959, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1626.3224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(969.1635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1353.9856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2437.4302, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2272.3723, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2252.9551, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3054245283018868
Sentence level Krippendorff's alpha for Premises:  0.24292452830188682
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 109
	Prediction sentences having premises: 329
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 11
	Prediction Sentence having neither claim nor premise: 1269
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1107
	Sentences having claim in only one of reference or prediction: 589
	Sentences having premise in both reference and prediction: 1054
	Sentences having premise in only one of reference or prediction: 642
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(1693.0201, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1287.4580, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1521.5662, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1459.5187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2153.1875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1474.1060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2203.2222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1832.3151, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(861.6227, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1751.0494, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1880.7588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1064.1531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1572.2848, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1603.9028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1050.3186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1118.9633, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1312.4651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1870.2142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(741.1201, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1529.5165, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1571.6813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2190.5603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1743.5635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1440.3762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1324.5273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(844.9536, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(926.4166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(519.8463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1778.7051, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1819.0427, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(903.7024, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1193.8334, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(843.5275, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1042.6831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1935.3384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1750.3040, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1976.1831, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3313679245283019
Sentence level Krippendorff's alpha for Premises:  0.4056603773584906
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 151
	Prediction sentences having premises: 983
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 12
	Prediction Sentence having neither claim nor premise: 574
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1129
	Sentences having claim in only one of reference or prediction: 567
	Sentences having premise in both reference and prediction: 1192
	Sentences having premise in only one of reference or prediction: 504
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(1546.3853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1158.2625, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1382.1992, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1276.2754, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1951.5247, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1177.2849, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1720.9385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1538.3666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(681.6022, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1376.1923, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1568.9482, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(817.0109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1275.9080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1271.7253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(911.4520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(957.8340, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1010.0745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1532.5175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(575.0855, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1242.1445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1443.5745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1898.5400, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1628.1587, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1349.7256, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1103.1810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(672.0240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(741.6677, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(469.7826, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1552.9717, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1618.5707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(787.9022, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(893.0609, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(659.5559, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(725.9762, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1525.8538, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1481.5991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1672.1945, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.32547169811320753
Sentence level Krippendorff's alpha for Premises:  0.40448113207547165
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 124
	Prediction sentences having premises: 894
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 20
	Prediction Sentence having neither claim nor premise: 698
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1124
	Sentences having claim in only one of reference or prediction: 572
	Sentences having premise in both reference and prediction: 1191
	Sentences having premise in only one of reference or prediction: 505
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(1344.4292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1005.5406, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1180.5966, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1060.6332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1596.7965, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(932.0819, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1505.9258, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1351.7861, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(547.5682, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1122.7118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1311.4468, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(706.2535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1109.0420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1152.6292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(789.0240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(855.8568, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(844.0687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1377.3577, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(469.8488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1042.9731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1406.9856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1674.6913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1415.0908, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1222.9746, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(964.7385, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(531.5327, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(602.1849, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(436.6314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1256.7299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1416.3113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(683.5605, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(669.4551, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(508.2999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(481.2155, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1191.5479, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1212.9548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1350.9965, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.32311320754716977
Sentence level Krippendorff's alpha for Premises:  0.42452830188679247
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 132
	Prediction sentences having premises: 965
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 24
	Prediction Sentence having neither claim nor premise: 623
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1122
	Sentences having claim in only one of reference or prediction: 574
	Sentences having premise in both reference and prediction: 1208
	Sentences having premise in only one of reference or prediction: 488
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(1116.0363, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(845.3284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(976.4225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(896.5391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1416.0453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(763.3813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1351.8276, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1185.5958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(438.1879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(917.5080, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1068.8551, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(574.7140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(943.6944, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1021.7136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(661.8546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(764.1995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(736.3831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1213.3694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(380.9601, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(814.1697, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1244.5948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1404.1943, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1134.7012, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(993.8829, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(800.2882, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.1519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.9831, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(388.2525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1061.2303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1245.8501, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(595.7021, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(519.0611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.2769, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.7675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(877.5970, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1022.0507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1068.6375, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.37735849056603776
Sentence level Krippendorff's alpha for Premises:  0.44339622641509435
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 222
	Prediction sentences having premises: 999
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 58
	Prediction Sentence having neither claim nor premise: 533
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1168
	Sentences having claim in only one of reference or prediction: 528
	Sentences having premise in both reference and prediction: 1224
	Sentences having premise in only one of reference or prediction: 472
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(946.2102, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(752.1775, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(819.9314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(649.1527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1091.7142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(631.2845, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1056.5808, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(985.5001, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(361.6998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(762.2369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(874.6637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(457.9617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(850.6897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(971.0011, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(562.3353, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(762.6346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(696.6332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1255.8057, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(374.2628, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(711.7118, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(942.0556, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1251.5787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(930.6173, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(800.8209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(655.6708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.5576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(337.2745, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.1439, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(833.7638, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1010.0021, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(514.3384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(404.8183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.0479, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.9116, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(742.5122, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(982.3440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1028.6824, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3915094339622641
Sentence level Krippendorff's alpha for Premises:  0.47759433962264153
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 664
	Prediction sentences having premises: 894
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 162
	Prediction Sentence having neither claim nor premise: 300
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1180
	Sentences having claim in only one of reference or prediction: 516
	Sentences having premise in both reference and prediction: 1253
	Sentences having premise in only one of reference or prediction: 443
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(908.7516, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(769.9766, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(872.0912, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(670.9462, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1068.6630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(569.8539, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(835.5195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(776.4071, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(268.9014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(508.2220, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(726.5504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(315.9507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(676.9505, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(710.2254, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(462.9763, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(606.0957, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(650.8400, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1360.4377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(328.1497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(621.4354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(806.8770, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1697.1042, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1085.8440, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(874.9867, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(643.4152, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.9124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(269.6515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.2200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(855.5947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(984.1123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(558.5769, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(311.3021, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(250.6782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(157.7634, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(572.7780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(598.3481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(695.9813, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.31014150943396224
Sentence level Krippendorff's alpha for Premises:  0.4186320754716981
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 835
	Prediction sentences having premises: 676
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 169
	Prediction Sentence having neither claim nor premise: 354
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1111
	Sentences having claim in only one of reference or prediction: 585
	Sentences having premise in both reference and prediction: 1203
	Sentences having premise in only one of reference or prediction: 493
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(689.6530, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(637.7302, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(741.0825, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(602.1696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1381.7632, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(606.9998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(970.9101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(836.5771, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(298.0502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(515.1687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1004.6989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.5483, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(654.2657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(574.6788, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(426.0881, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(519.8213, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(535.7244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1311.1191, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(256.8101, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(713.0197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(808.9332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1753.8448, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1577.7224, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1131.2772, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(539.3228, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.2150, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(219.4480, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(169.6536, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(571.8027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(742.1208, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(423.1830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.5664, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.0917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.6999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(559.5259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(706.6425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(827.8252, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3915094339622641
Sentence level Krippendorff's alpha for Premises:  0.347877358490566
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 548
	Prediction sentences having premises: 376
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 59
	Prediction Sentence having neither claim nor premise: 831
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1180
	Sentences having claim in only one of reference or prediction: 516
	Sentences having premise in both reference and prediction: 1143
	Sentences having premise in only one of reference or prediction: 553
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(863.9207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(700.6589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(688.8143, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(610.4785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(959.1599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(611.9281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(904.3326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(708.0861, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.4910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(414.2366, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(612.3876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(246.8153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(516.7725, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(500.9828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(319.4087, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(444.4651, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(442.3654, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1010.9880, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(242.5933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(437.4164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(667.9680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1176.7861, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(858.4504, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(728.1145, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(643.2070, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.2081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.3468, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.1860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.5376, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(809.8128, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(463.3916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.8919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(202.4374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(117.7142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(443.2660, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(469.6949, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(498.4467, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.34198113207547165
Sentence level Krippendorff's alpha for Premises:  0.4610849056603774
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 790
	Prediction sentences having premises: 628
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 175
	Prediction Sentence having neither claim nor premise: 453
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1138
	Sentences having claim in only one of reference or prediction: 558
	Sentences having premise in both reference and prediction: 1239
	Sentences having premise in only one of reference or prediction: 457
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(611.7394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(502.8820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(485.5861, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.2309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(739.7724, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(447.7729, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(687.5469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(615.4384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(326.5629, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(535.2520, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(923.0974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(397.8608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(687.3060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(708.6487, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(365.7457, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(435.0445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.0823, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(760.8368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.7769, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(445.3222, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(423.0244, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(802.2069, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(634.5891, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.7612, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(376.5215, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.4680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.6620, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.8531, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(407.0238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(661.8368, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(401.8207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(229.0869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(312.5623, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(164.2658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(549.7188, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(661.1619, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(982.7474, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3702830188679245
Sentence level Krippendorff's alpha for Premises:  0.3915094339622641
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 208
	Prediction sentences having premises: 1195
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 55
	Prediction Sentence having neither claim nor premise: 348
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1162
	Sentences having claim in only one of reference or prediction: 534
	Sentences having premise in both reference and prediction: 1180
	Sentences having premise in only one of reference or prediction: 516
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(763.8247, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(529.3212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(757.9008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.4558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(633.6902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.9976, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(529.5229, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(599.5488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.2574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.7374, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(428.6323, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(159.8636, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(464.2695, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.2563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(240.5110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.4897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(338.0466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(716.6774, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.4773, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(341.8121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(758.5647, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1073.9397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(820.1909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(618.6251, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(489.1100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(146.3631, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.0786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.1461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(621.5469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(603.3016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(373.0259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.7747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.5461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.9778, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(278.4148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(461.5951, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(306.4409, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.41981132075471694
Sentence level Krippendorff's alpha for Premises:  0.47051886792452835
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 472
	Prediction sentences having premises: 948
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 155
	Prediction Sentence having neither claim nor premise: 431
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1204
	Sentences having claim in only one of reference or prediction: 492
	Sentences having premise in both reference and prediction: 1247
	Sentences having premise in only one of reference or prediction: 449
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(425.1661, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(343.3328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(306.8155, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(197.8445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(518.1964, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(275.4701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(324.9868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(402.7468, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.9939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(226.6715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(381.4487, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.1033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(443.9786, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.7119, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(228.2502, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.9575, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(262.5365, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(721.9757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.1707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.5793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(266.1956, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(740.2787, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(588.7738, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(422.7162, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.8530, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.9340, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.4020, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.3891, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.9941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(352.4055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.3384, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(129.8820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.1658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.5096, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(185.6995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.7121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.0045, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.42688679245283023
Sentence level Krippendorff's alpha for Premises:  0.4799528301886793
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 576
	Prediction sentences having premises: 910
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 185
	Prediction Sentence having neither claim nor premise: 395
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1210
	Sentences having claim in only one of reference or prediction: 486
	Sentences having premise in both reference and prediction: 1255
	Sentences having premise in only one of reference or prediction: 441
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(343.2805, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(281.4970, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(244.2464, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.3315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(412.6676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.2015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(239.1818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(287.0950, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.0661, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.5139, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.0298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.2958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.1588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.2903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(127.4226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.0098, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.3449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(407.8107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(120.1847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.9468, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(222.0907, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(548.0715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(445.7896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.7230, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.8554, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(70.8652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.5838, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.3213, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.9723, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(280.7608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(194.7573, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.1644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.9822, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.6737, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.6421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(172.6278, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.6879, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4304245283018868
Sentence level Krippendorff's alpha for Premises:  0.5047169811320755
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 565
	Prediction sentences having premises: 901
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 165
	Prediction Sentence having neither claim nor premise: 395
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1213
	Sentences having claim in only one of reference or prediction: 483
	Sentences having premise in both reference and prediction: 1276
	Sentences having premise in only one of reference or prediction: 420
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(278.4413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.4880, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(187.6691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.2367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(351.3136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(170.6503, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.2007, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(242.1242, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.8435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.8715, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.5358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(86.6951, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(245.9605, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.4315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.4210, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(137.5535, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.5526, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(357.5029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.5314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.1062, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.3319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(481.6140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(369.6974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.5965, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.3250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.0980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.5201, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.6618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.1548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.8033, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(146.4292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(90.1979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.6773, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.7047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(109.3072, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(125.6476, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(93.6536, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4339622641509434
Sentence level Krippendorff's alpha for Premises:  0.5117924528301887
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 556
	Prediction sentences having premises: 941
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 180
	Prediction Sentence having neither claim nor premise: 379
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1216
	Sentences having claim in only one of reference or prediction: 480
	Sentences having premise in both reference and prediction: 1282
	Sentences having premise in only one of reference or prediction: 414
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(235.6709, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.6013, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.6617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.9221, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(280.8391, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.4445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(146.1533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.0676, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.8599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.8318, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.4483, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.0578, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(184.0069, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.9656, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.1028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.8245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.7028, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(299.2610, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.5525, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.2603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.6497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(416.7585, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(316.8856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.7169, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.7290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.3910, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.5453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.1382, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.3891, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(195.2426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.8870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.0630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.5989, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.6357, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.7687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.5421, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.8689, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4410377358490566
Sentence level Krippendorff's alpha for Premises:  0.5153301886792453
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 608
	Prediction sentences having premises: 902
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 189
	Prediction Sentence having neither claim nor premise: 375
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1222
	Sentences having claim in only one of reference or prediction: 474
	Sentences having premise in both reference and prediction: 1285
	Sentences having premise in only one of reference or prediction: 411
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(211.9792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(152.0720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.7156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.3611, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(222.8513, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.4252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(111.1901, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.1612, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.0642, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.6181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.1456, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.2435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.4780, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.5317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.8896, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.6546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.6978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(257.0432, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.6206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.2488, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.3726, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(379.5105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.4103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(151.4666, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.4458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.1458, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.1967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.8211, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.2157, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.0275, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.9675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.8154, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.3791, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.4469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.9775, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.4056, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.0381, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4386792452830188
Sentence level Krippendorff's alpha for Premises:  0.4929245283018868
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 656
	Prediction sentences having premises: 893
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 202
	Prediction Sentence having neither claim nor premise: 349
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1220
	Sentences having claim in only one of reference or prediction: 476
	Sentences having premise in both reference and prediction: 1266
	Sentences having premise in only one of reference or prediction: 430
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(208.8637, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.0207, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.2721, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.5172, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(199.9403, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(82.0127, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.1859, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(162.5944, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(48.6245, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(34.2480, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.7375, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.4345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(112.7606, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.2021, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.4842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.4543, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.1552, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(231.3330, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.1025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.1031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.2081, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(371.3546, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(275.9742, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.8006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(85.8649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.5200, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.7469, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(9.5702, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(84.1209, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(143.8528, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.8394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.7291, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.0263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(16.3760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.8828, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.5123, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.9300, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.44339622641509435
Sentence level Krippendorff's alpha for Premises:  0.4834905660377359
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 538
	Prediction sentences having premises: 967
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 176
	Prediction Sentence having neither claim nor premise: 367
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1224
	Sentences having claim in only one of reference or prediction: 472
	Sentences having premise in both reference and prediction: 1258
	Sentences having premise in only one of reference or prediction: 438
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(164.0743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.6607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.5039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.5423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.0643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.4153, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(74.8972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(177.6618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(37.3659, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.4900, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(78.7519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.1111, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.0761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.6993, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.7797, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.3647, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.2603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(215.5768, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.5412, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.5257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.0760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(348.0920, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.5941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.5965, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.8952, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.5000, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.7866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(7.1253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.4705, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.2698, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.5249, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.6248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.3329, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.0139, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.0115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.8023, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.4038, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4221698113207547
Sentence level Krippendorff's alpha for Premises:  0.48231132075471694
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 756
	Prediction sentences having premises: 826
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 231
	Prediction Sentence having neither claim nor premise: 345
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1206
	Sentences having claim in only one of reference or prediction: 490
	Sentences having premise in both reference and prediction: 1257
	Sentences having premise in only one of reference or prediction: 439
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(168.6790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(102.5466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.2657, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(103.8335, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(281.4423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.8607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.6743, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(166.3694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.8725, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.0107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(68.8429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.1615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.0985, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(35.6838, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.1630, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.9893, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(44.6879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(175.2842, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.2865, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.9939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.1069, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(285.8939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.2330, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(116.3163, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(61.9251, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.4890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.9358, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(5.4027, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.8379, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.2008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.0968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.1583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(14.4294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(10.4175, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.2689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.3273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.6171, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4528301886792453
Sentence level Krippendorff's alpha for Premises:  0.47523584905660377
Additional attributes: 
	Total Sentences: 1696
	Prediction sentences having claims: 532
	Prediction sentences having premises: 988
	Reference sentences having claims: 642
	Reference sentences having premises: 751


	Prediction Sentence having both claim and premise: 164
	Prediction Sentence having neither claim nor premise: 340
	Reference Sentence having both claim and premise: 135
	Reference Sentence having neither claim nor premise: 438


	Sentences having claim in both reference and prediction: 1232
	Sentences having claim in only one of reference or prediction: 464
	Sentences having premise in both reference and prediction: 1251
	Sentences having premise in only one of reference or prediction: 445
				 Metric computations: None


		-------------RUN 5-----------
			------------EPOCH 1---------------
Loss:  tensor(1823.6136, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1064.5640, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2511.9612, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3286.3442, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1193.2131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2385.8872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1565.8148, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2467.6001, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2361.6008, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2631.5732, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1964.2151, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2678.0779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2125.3257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(3108.1182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1193.6549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1658.5232, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2629.4272, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2182.6091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1614.5073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2064.1980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1183.9722, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1246.0565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1166.4248, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(801.5734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1483.3348, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2509.9980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1573.3048, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1926.9648, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1711.7413, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2069.2290, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1453.2107, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1759.0242, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1709.7339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2587.5483, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.18499218342886925
Sentence level Krippendorff's alpha for Premises:  0.22876498176133397
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 223
	Prediction sentences having premises: 48
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 23
	Prediction Sentence having neither claim nor premise: 1671
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1137
	Sentences having claim in only one of reference or prediction: 782
	Sentences having premise in both reference and prediction: 1179
	Sentences having premise in only one of reference or prediction: 740
				 Metric computations: None
			------------EPOCH 2---------------
Loss:  tensor(1071.8721, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(789.2829, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1457.9834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2356.9243, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(942.1907, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1935.4003, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1218.3588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2110.3799, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2039.8871, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2284.7747, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1552.5260, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2297.3967, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1864.7847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2769.3091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1045.5029, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1391.6517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2495.5264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2158.9941, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1558.5515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2060.3892, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(949.7950, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1059.7859, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1000.3868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(697.1678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1270.6309, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2150.5874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1233.5190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1637.7091, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1541.1897, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1897.0508, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1261.8047, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1507.5142, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1388.8167, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2164.9443, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.2516935904116727
Sentence level Krippendorff's alpha for Premises:  0.3673788431474726
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 363
	Prediction sentences having premises: 1123
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 64
	Prediction Sentence having neither claim nor premise: 497
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1201
	Sentences having claim in only one of reference or prediction: 718
	Sentences having premise in both reference and prediction: 1312
	Sentences having premise in only one of reference or prediction: 607
				 Metric computations: None
			------------EPOCH 3---------------
Loss:  tensor(1064.2644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(690.6536, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1403.6301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2023.7981, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(877.3131, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1703.0968, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1117.9187, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2075.9541, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2134.8445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2229.1812, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1350.1394, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2317.0225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1518.2263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2369.6006, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(922.3954, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1283.2869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2212.9756, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1958.0658, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1172.5938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1838.9519, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(903.1725, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(936.3633, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(894.1190, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(612.0927, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1120.6387, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1780.9805, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1060.0330, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1338.8569, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1295.4790, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1656.3696, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1026.1721, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1247.1223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1444.4435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2345.5791, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.293381969775925
Sentence level Krippendorff's alpha for Premises:  0.40281396560708704
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 663
	Prediction sentences having premises: 1039
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 170
	Prediction Sentence having neither claim nor premise: 387
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1241
	Sentences having claim in only one of reference or prediction: 678
	Sentences having premise in both reference and prediction: 1346
	Sentences having premise in only one of reference or prediction: 573
				 Metric computations: None
			------------EPOCH 4---------------
Loss:  tensor(880.9084, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(595.4572, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1202.6902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1861.0935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(666.1182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1420.1016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1010.3400, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1735.5875, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1703.1016, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1785.7708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1131.8940, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1931.1707, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1300.2031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(2068.2053, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(754.0917, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(996.4844, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1987.1113, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1747.3241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1059.9347, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1645.0803, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(796.1785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(834.7103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(756.1346, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(528.5549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1005.1024, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1656.5250, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(945.8416, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1159.8901, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1142.3201, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1417.9827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(876.7134, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1067.8761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1136.8623, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1820.1322, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3955184992183429
Sentence level Krippendorff's alpha for Premises:  0.37780093798853565
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 479
	Prediction sentences having premises: 1175
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 182
	Prediction Sentence having neither claim nor premise: 447
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1339
	Sentences having claim in only one of reference or prediction: 580
	Sentences having premise in both reference and prediction: 1322
	Sentences having premise in only one of reference or prediction: 597
				 Metric computations: None
			------------EPOCH 5---------------
Loss:  tensor(769.4289, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(515.7076, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1084.3497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1738.0369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(616.6691, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1223.0161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(883.7218, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1598.6106, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1526.9132, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1555.6930, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(891.5970, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1735.2451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1118.9792, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1798.0913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(613.4874, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(778.3115, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1813.5934, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1624.9355, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(862.1478, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1508.4264, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(678.4316, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(717.2319, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(606.1347, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(441.6777, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(879.4793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1507.4783, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(813.0524, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1026.5604, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(949.3579, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1194.2356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(689.1241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(871.3980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(975.2194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1583.1233, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4288692027097447
Sentence level Krippendorff's alpha for Premises:  0.4215737363210005
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 625
	Prediction sentences having premises: 1065
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 197
	Prediction Sentence having neither claim nor premise: 426
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1371
	Sentences having claim in only one of reference or prediction: 548
	Sentences having premise in both reference and prediction: 1364
	Sentences having premise in only one of reference or prediction: 555
				 Metric computations: None
			------------EPOCH 6---------------
Loss:  tensor(619.7278, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(429.3683, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(942.1978, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1445.9055, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(509.0289, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(975.4515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(744.2278, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1363.1614, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1339.0371, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1340.5708, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(766.2946, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1654.1149, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(932.4240, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1598.7673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(537.3182, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(692.4879, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1495.4045, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1369.6482, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(592.0325, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1181.6296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(608.2827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(559.7100, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(454.1296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(329.0168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(737.7687, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1314.7515, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(673.6310, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(834.2330, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(804.5710, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(977.1680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(561.7712, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(730.4434, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(884.8521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1545.1726, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.3434080250130276
Sentence level Krippendorff's alpha for Premises:  0.416362688900469
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 1083
	Prediction sentences having premises: 814
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 224
	Prediction Sentence having neither claim nor premise: 246
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1289
	Sentences having claim in only one of reference or prediction: 630
	Sentences having premise in both reference and prediction: 1359
	Sentences having premise in only one of reference or prediction: 560
				 Metric computations: None
			------------EPOCH 7---------------
Loss:  tensor(554.8673, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(364.0593, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(849.6919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1261.6685, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(367.9493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(748.4121, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(669.7939, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1185.4221, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1071.4944, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1078.1257, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(609.6204, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1169.4122, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(722.8259, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1388.3445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(438.5983, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(573.1221, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1292.9296, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1291.3843, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(460.1246, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1010.9435, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(620.3140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(496.8997, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(410.0360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(284.0314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(646.0767, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1170.7781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(617.8860, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(693.0621, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(632.2367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(687.4159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(425.4307, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(543.6716, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(664.7902, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1149.9512, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.2516935904116727
Sentence level Krippendorff's alpha for Premises:  0.41844710786868156
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 1255
	Prediction sentences having premises: 532
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 139
	Prediction Sentence having neither claim nor premise: 271
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1201
	Sentences having claim in only one of reference or prediction: 718
	Sentences having premise in both reference and prediction: 1361
	Sentences having premise in only one of reference or prediction: 558
				 Metric computations: None
			------------EPOCH 8---------------
Loss:  tensor(488.5316, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.8532, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(776.5779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1054.7183, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(305.6618, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(658.3830, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(643.1979, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1066.7845, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(945.5701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1032.4916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(568.7426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(994.3899, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(643.2635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1256.8126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(322.0150, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(323.5933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1063.5581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1065.1412, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(414.6603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(796.4834, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(548.2380, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(367.8782, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(289.5887, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(190.1393, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(613.8306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1148.8206, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(540.7281, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(621.8665, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(594.1141, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(623.1397, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(411.1194, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(576.2506, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(682.3868, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1081.9545, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.42678478374153206
Sentence level Krippendorff's alpha for Premises:  0.416362688900469
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 489
	Prediction sentences having premises: 1170
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 187
	Prediction Sentence having neither claim nor premise: 447
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1369
	Sentences having claim in only one of reference or prediction: 550
	Sentences having premise in both reference and prediction: 1359
	Sentences having premise in only one of reference or prediction: 560
				 Metric computations: None
			------------EPOCH 9---------------
Loss:  tensor(400.9425, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(345.5168, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(744.5756, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1270.2275, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(264.2104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(516.8548, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.5563, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(943.5328, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(678.6433, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(725.5974, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(405.5612, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(746.5015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(603.7158, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1144.0104, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(366.7866, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(295.1263, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1012.7159, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(832.0103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(400.2615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(748.3470, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.9086, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(381.5817, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(253.5058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.0603, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(527.1481, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1009.6108, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(357.7197, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(535.0233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(537.1042, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(532.4694, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(389.3533, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(506.7030, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.3912, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1119.6456, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4007295466388744
Sentence level Krippendorff's alpha for Premises:  0.4319958311620635
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 774
	Prediction sentences having premises: 1089
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 251
	Prediction Sentence having neither claim nor premise: 307
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1344
	Sentences having claim in only one of reference or prediction: 575
	Sentences having premise in both reference and prediction: 1374
	Sentences having premise in only one of reference or prediction: 545
				 Metric computations: None
			------------EPOCH 10---------------
Loss:  tensor(204.4437, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.1370, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(519.7424, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(802.8933, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.1649, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.0855, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(309.5497, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(651.0059, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(541.4584, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(616.8165, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(363.4730, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(667.3930, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(427.6073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(936.6332, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(248.0932, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(212.0314, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(746.4401, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(667.0430, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(279.2574, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(511.1445, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(350.3449, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(214.7813, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.1920, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(118.8135, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(369.6058, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(726.8663, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(255.5781, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.1678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(377.4692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(358.7685, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(196.0610, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(296.8872, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(464.0693, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(715.8481, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.45179781136008335
Sentence level Krippendorff's alpha for Premises:  0.32777488275143307
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 481
	Prediction sentences having premises: 1307
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 175
	Prediction Sentence having neither claim nor premise: 306
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1393
	Sentences having claim in only one of reference or prediction: 526
	Sentences having premise in both reference and prediction: 1274
	Sentences having premise in only one of reference or prediction: 645
				 Metric computations: None
			------------EPOCH 11---------------
Loss:  tensor(173.2075, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(202.7560, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(456.1341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(733.4973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.3938, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(302.8909, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(234.1252, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(558.2253, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(408.7234, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(509.8613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(313.8783, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(504.2034, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(374.0327, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(714.6174, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(189.9303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(134.3720, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(719.2025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(512.6417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(370.4890, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(451.7549, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(386.4347, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(354.5835, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.2181, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(171.0137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(473.7600, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1157.1266, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(458.1357, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(595.2313, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(292.2853, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.2097, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.8999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(250.6038, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(362.1847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(549.5005, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4330380406461699
Sentence level Krippendorff's alpha for Premises:  0.2610734757686295
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 339
	Prediction sentences having premises: 1421
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 130
	Prediction Sentence having neither claim nor premise: 289
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1375
	Sentences having claim in only one of reference or prediction: 544
	Sentences having premise in both reference and prediction: 1210
	Sentences having premise in only one of reference or prediction: 709
				 Metric computations: None
			------------EPOCH 12---------------
Loss:  tensor(155.6973, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(203.1360, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(394.9613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(744.1617, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(221.0226, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(488.7903, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(312.5674, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(771.5995, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(794.3324, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1068.2825, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(431.6273, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1051.8035, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(338.7284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(988.9920, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(267.0901, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.8751, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(524.6521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(485.9841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(181.5935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(375.8233, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(260.0420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(191.6514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.8025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.2678, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(374.2195, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(908.3331, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.0383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(495.3303, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(725.5951, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(763.5928, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(446.9146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(632.3760, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(690.0320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1142.9674, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.23501823866597182
Sentence level Krippendorff's alpha for Premises:  0.3434080250130276
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 1239
	Prediction sentences having premises: 338
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 112
	Prediction Sentence having neither claim nor premise: 454
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1185
	Sentences having claim in only one of reference or prediction: 734
	Sentences having premise in both reference and prediction: 1289
	Sentences having premise in only one of reference or prediction: 630
				 Metric computations: None
			------------EPOCH 13---------------
Loss:  tensor(421.7724, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(205.3298, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(794.0068, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(1050.4341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.5972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(408.5816, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(337.7483, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(624.2913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(435.9453, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(566.4928, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(293.4180, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(479.1704, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(273.6652, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(599.9078, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(150.8366, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.1690, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(684.7668, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(670.2523, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(229.1589, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(468.7431, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.9935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(217.4680, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.5114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(115.2891, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(412.1856, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(776.1411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(301.5999, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(428.6836, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(296.2588, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(383.3765, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(164.2279, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.5137, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(483.3461, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(855.0653, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4507556018759771
Sentence level Krippendorff's alpha for Premises:  0.3298593017196456
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 400
	Prediction sentences having premises: 1321
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 144
	Prediction Sentence having neither claim nor premise: 342
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1392
	Sentences having claim in only one of reference or prediction: 527
	Sentences having premise in both reference and prediction: 1276
	Sentences having premise in only one of reference or prediction: 643
				 Metric computations: None
			------------EPOCH 14---------------
Loss:  tensor(124.6728, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(186.5706, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(299.0493, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(628.8339, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(108.9337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.5356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(158.9317, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(428.7739, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(263.8643, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(370.4795, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(230.5474, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(324.1811, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.5103, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(511.6613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(131.0367, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.8216, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(404.9337, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(378.4759, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(164.8833, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(265.7507, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.5060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.1230, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.7166, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.5282, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(226.6073, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(493.5779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.0837, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(200.6315, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(226.7825, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(216.8761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(91.1105, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(146.5277, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(296.0685, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(435.8954, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.39864512767066185
Sentence level Krippendorff's alpha for Premises:  0.4486711829077644
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 962
	Prediction sentences having premises: 781
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 200
	Prediction Sentence having neither claim nor premise: 376
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1342
	Sentences having claim in only one of reference or prediction: 577
	Sentences having premise in both reference and prediction: 1390
	Sentences having premise in only one of reference or prediction: 529
				 Metric computations: None
			------------EPOCH 15---------------
Loss:  tensor(87.9110, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.1581, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(277.0889, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(515.4972, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.6340, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(163.9003, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(160.8383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(333.8779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(236.6971, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(355.5757, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.5044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(331.0411, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.9810, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(409.4550, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(100.3402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.4097, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(283.0947, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(250.1452, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.0025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(198.8699, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(178.7565, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(80.8124, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(71.6140, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(46.0466, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.0310, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(422.2583, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.7600, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(147.9633, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.1734, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.9455, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.1595, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.8891, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(233.6648, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(332.9089, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.48931735278791033
Sentence level Krippendorff's alpha for Premises:  0.3871808233454924
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 533
	Prediction sentences having premises: 1182
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 182
	Prediction Sentence having neither claim nor premise: 386
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1429
	Sentences having claim in only one of reference or prediction: 490
	Sentences having premise in both reference and prediction: 1331
	Sentences having premise in only one of reference or prediction: 588
				 Metric computations: None
			------------EPOCH 16---------------
Loss:  tensor(47.5692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(113.4361, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(138.0350, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(369.0402, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.3214, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(104.7426, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(76.5015, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(270.9177, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(153.9249, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(243.5913, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(130.5713, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(168.8847, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(139.0761, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(313.1274, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(62.1044, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.4926, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(247.8613, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(225.0085, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(88.6301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.1060, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(144.6320, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.0689, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.6235, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.5241, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.6353, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(345.5386, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.8420, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.6870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(110.6894, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(136.1701, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.1335, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(64.8700, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(179.8635, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(262.6652, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.48097967691505994
Sentence level Krippendorff's alpha for Premises:  0.4132360604481501
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 699
	Prediction sentences having premises: 1067
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 205
	Prediction Sentence having neither claim nor premise: 358
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1421
	Sentences having claim in only one of reference or prediction: 498
	Sentences having premise in both reference and prediction: 1356
	Sentences having premise in only one of reference or prediction: 563
				 Metric computations: None
			------------EPOCH 17---------------
Loss:  tensor(34.9039, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(95.4919, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(114.0807, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(288.3958, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.1451, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.3120, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.9517, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(213.5948, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(106.0779, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(211.3719, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.0626, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(135.6184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(98.9350, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(258.7417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.1354, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(31.7099, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(167.6480, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(154.8521, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(69.2002, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.1509, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.3615, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.6793, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(40.0755, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(25.5998, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(133.4514, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(290.8401, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.7077, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.2164, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.3558, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.2011, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.1381, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.3608, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(145.7861, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(227.9407, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.47681083897863474
Sentence level Krippendorff's alpha for Premises:  0.41844710786868156
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 681
	Prediction sentences having premises: 1064
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 209
	Prediction Sentence having neither claim nor premise: 383
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1417
	Sentences having claim in only one of reference or prediction: 502
	Sentences having premise in both reference and prediction: 1361
	Sentences having premise in only one of reference or prediction: 558
				 Metric computations: None
			------------EPOCH 18---------------
Loss:  tensor(23.7345, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.5326, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(81.5785, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(238.9271, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.1876, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(54.9306, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.0906, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(141.6429, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(72.0294, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(176.5212, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(66.8236, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.4599, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.3417, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(235.0126, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.3377, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.0483, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(126.7523, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(123.2299, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(53.9463, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(59.5341, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(83.8025, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(36.6156, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(30.7146, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(19.9287, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.9675, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(261.7225, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.0329, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(65.9644, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(57.9369, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(92.5284, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.3280, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.1645, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(119.9980, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.2140, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.491401771756123
Sentence level Krippendorff's alpha for Premises:  0.40698280354351224
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 683
	Prediction sentences having premises: 1095
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 214
	Prediction Sentence having neither claim nor premise: 355
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1431
	Sentences having claim in only one of reference or prediction: 488
	Sentences having premise in both reference and prediction: 1350
	Sentences having premise in only one of reference or prediction: 569
				 Metric computations: None
			------------EPOCH 19---------------
Loss:  tensor(18.3935, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(56.9161, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(60.3292, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(209.2692, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.6527, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(43.2026, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.3114, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(128.3818, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(51.5820, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(155.4031, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(50.1936, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(75.3457, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(58.0991, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(201.2960, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.6918, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.5301, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.6579, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(99.5869, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(45.5415, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.6125, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(67.5576, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(27.1528, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.1316, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(13.7538, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(107.3921, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(224.0992, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(22.9238, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(52.4557, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.4184, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(79.9932, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(23.2664, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(24.2598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(101.9665, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(182.4279, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4736842105263158
Sentence level Krippendorff's alpha for Premises:  0.4090672225117249
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 692
	Prediction sentences having premises: 1087
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 221
	Prediction Sentence having neither claim nor premise: 361
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1414
	Sentences having claim in only one of reference or prediction: 505
	Sentences having premise in both reference and prediction: 1352
	Sentences having premise in only one of reference or prediction: 567
				 Metric computations: None
			------------EPOCH 20---------------
Loss:  tensor(14.5916, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(41.5590, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(49.1316, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(173.6988, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(18.2223, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.6423, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.6109, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(124.9598, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(47.1532, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(140.2383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(39.2920, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(63.1186, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(38.8013, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(193.7822, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(26.9870, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(12.9841, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(77.5460, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(89.5731, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(29.7387, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(33.9106, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(55.5672, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(28.4409, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(20.9457, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.2179, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(97.7383, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(204.6014, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(15.7607, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(42.8356, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(32.1937, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(73.0827, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.8176, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(17.5053, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(87.6353, device='cuda:0', grad_fn=<DivBackward0>)
Loss:  tensor(165.0446, device='cuda:0', grad_fn=<DivBackward0>)
Sentence level Krippendorff's alpha for Claims:  0.4788952579468473
Sentence level Krippendorff's alpha for Premises:  0.4132360604481501
Additional attributes: 
	Total Sentences: 1919
	Prediction sentences having claims: 627
	Prediction sentences having premises: 1113
	Reference sentences having claims: 741
	Reference sentences having premises: 758


	Prediction Sentence having both claim and premise: 206
	Prediction Sentence having neither claim nor premise: 385
	Reference Sentence having both claim and premise: 139
	Reference Sentence having neither claim nor premise: 559


	Sentences having claim in both reference and prediction: 1419
	Sentences having claim in only one of reference or prediction: 500
	Sentences having premise in both reference and prediction: 1356
	Sentences having premise in only one of reference or prediction: 563
				 Metric computations: None
